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ABSTRACT: Computational theories of structure from motion [Ullman, 1979] and stereo vision 
[Marr and Poggio, 1979] only specify die computation of three-dimensional surface information at 
special points in die image. Yet, the visual perception is clearly of complete surfaces. In order to 
account for this, a computational theory of the interpolation of surfaces from visual information is 

presented. 

The problem is constrained by the fact that die surface must agree with die information from 
stereo or motion correspondence, and not vary radically between these points. Using the image 
in adiance equation [Horn, 1977], an explicit form of this surface consistency constraint can be derived 

IGrimson. 1981c]. 

To determine which of two possible surfaces is more consistent with die surface consistency 
constraint, one must be able to compare the two surfaces. To do this, a functional from foe space 
uf functions to the real numbers is required. In this way, the surface most consistent with the visual 
information will be that which minimizes the functional. To ensure that die functional has a unique 
minimal surface, conditions on foe form of the functional arc derived. In particular, if foe functional 
is a complete semi-norm which satisfies the parallelogram law, or the space of functions is a semi- 
Hilbert space and the functional is a semi-inner product, dien there is a unique (to within an clement 
uf die null space of the functional) surface which is most consistent with foe visual information. 

It can be shown, based on die above conditions plus a condition of rotational symmetry, diat 
there is a vector space of possible functional which measure surface consistency, fois vector space 
being spanned by the functional of quadratic variation and the functional of square Laplacian [Brady 
and Horn, 1981]. Arguments based on the null spaces of the respective functional are used to justify 
the choice of the quadratic variation as the optimal functional. 

Algorithms for computing the surface which minimizes quadratic variation in die case of exact 
surface interpolation and in foe case of surface approximation are outlined and illustrated on a series 
of synthetic and actual surface interpolation examples. 
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Although our world has three spatial dimensions, the projection of light rays onto the retina 
presents our visual system with an image of the world that is inherently two-dimensional. We must 
use such images to physically interact with this three-dimensional world, even in situations new to 
us, or with objects unknown to us. That we do so easily implies that one of the functions of the 
human visual system is to reconstruct a three-dimensional representation of the world from its two- 
dimensional projection onto our eyes. 

Methods that could be used to effect this three-dimensional reconstruction include stereo vision 
[Marr and Poggio, 1979; Crimson, 1980, 1981a; Mayhew and Frisby, 1981] and structure from motion 
[Ullman, 1979a]. Both of these methods may be considered as correspondence techniques, since 
they rely on establishing a correspondence between identical items in different images, and using the 
difference in projection of these items to determine surface shape. That is, correspondence methods 
compute surface information by: 

(1) Identifying a location in the physical scene in one image; 

(2) Identifying the corresponding location in a second image; either a second image taken from a 
different viewpoint in space (stereo) or a second image taken from a different viewpoint in time 
(structure from motion); and 

(3) Computing a three-dimensional surface value, representing the distance of the point relative to 
some base point, based on the difference in the positions of die two corresponding points in the 
images. 

Current computational theories of these processes [Marr and Poggio, 1979; Mayhew and Frisby, 
1981; Ullman, 1979a] argue that the correspondence process cannot take place at all points in an 
image. Rather, the first stage of the correspondence process is to derive a symbolic description of 
points in the image at which the irradiance undergoes a significant change [Marr and Hildreth, 1980]. 
This symbolic representation (called the primal sketch [Marr, 1976; Marr and Hildreth, 1980]) forms 
the input to the second stage of the process in which the actual correspondence is computed. As a 
consequence of the form of the input, the correspondence process can compute explicit surface infor- 
mation only at scattered points in the image. Yet our perception is clearly of complete surfaces. (For 
example, in Figure I, a sparse random clot stereogram yields the vivid perception of a square floating 
/ - *" v - in space above a background plane, rather than a collection of dots suspended in space.) The problem 
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Figure 1. A Sparse Random Dot Pattern. Although the density of dots is very small, the perception 
obtained upon fusing this pattern is one of two disjoint planes, rather than dots isolated in depth. 



to be addressed in this paper is that of computing complete surface representations, by interpolating 
an initial representation consisting of sparse surface values. 

We will examine this surface interpolation problem at two levels. The first level is to consider 
die strictly mathematical question of surface reconstruction, independent of its relevance to the 
human visual system. Suppose one is given a visual process which determines surface information 
at points corresponding to relevant changes in the images. In general, there will be many possible 
surfaces consistent with these initial surface points. For example, consider the boundary conditions 
provided by a circular arc, along which the depth is constant. The possible surface consistent with 
these known points include a flat disc, a sphere and even the highly convoluted surface formed by a 
radial sine function (see Figure 3). How do we distinguish the correct one? Mathematically, we need 
to be able to compare two possible surfaces, in order to determine which is "better". This can be done 
by defining a functional from the space of surfaces to the real numbers, so that comparing surfaces 
can be accomplished by comparing corresponding real numbers. Provided 0(/) < 0(g) whenever 
surface / is "better" than surface <?, the "best" surface to fit through the known points is that which 
minimizes ©. There arc two problems to solve here: (1 ) What does it mean tov f to he "better" than 
gl and< 2) finder what cofklitions does a unique "best" surface exist? 



Once these questions have been answered and an appropriate functional has been derived, we 
can turn to the second level, which is to consider a specific algorithm for finding the surface that 
optimizes the functional. Because our intent is to consider models for the .interpolation process as it 
occurs in the human visual system, we will restrict our attention to biologically feasible algorithms 
[Ullman, 1979b; Grimson, 1981b]. 

The motivation for considering the interpolation problem first mathematically, independent of 
the specifics of the human system, and then algorithmically, incorporating specific biological con- 
straints, is based on the assumption that one can consider the visual system as a symbol manipulation 
process [Marr 1976, 1981; Marr and Poggio, 1977]. This implies that the meaning of the symbols being 
manipulated can be distinguished from the physical embodiment of those symbols. Hence, one can 
deal with the mathematical consideration of the information processing which is occurring, independ- 
ent of the implementation of that processing (whether in transistors or neurons). The rationale for 
this view lies in the belief that any computational theory should address the fundamental questions 
of the information processing necessary to perform the task, and that such computational theories 
/"""V, are independent, to a large extent, of the method used to compute them. The initial goal is thus to 

determine computational constraints on the interpolation problem, based on the input and output 
representations of the process, and based on the structure of the computation required to transform 
one representation into the other. Note that a computational theory of the information process- 
ing is applicable both to the human visual system, and to applications areas (such as high-altitude 
photomapping, hand-eye coordination systems, industrial robotics, and inspection of manufactured 
parts) where it is useful to create a complete specification of surface shape. 

While we shall initially concentrate on the mathematical aspects of visual surface interpolation, 
the problem is not completely isolated from the human visual system. If we view the human early 
visual system as a symbolic manipulator, we can consider visual processing as a series of transfor- 
mations from one representation to another [Marr, 1976, 1981]. In particular, three stages can be 
identified (sec Figure 2). From the images, one transforms to a description, called the primal sketch, 
of those locations at which the image irradianccs change. Next, primal sketch descriptions of several 
images are matched, either by die stereo or motion computation, to obtain a description of surface 
information at the zero-crossings. This representation is called the raw 2^1) sketch. Finally, the raw 
2 J I) sketch is interpolated to obtain complete surface descriptions, called the full 2^1) sketch |Marr, 
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1978; Man* and Nishihara, 1978]. The first two stages have been considered elsewhere [Marr, t§76\ 
Marr and Hildreth, 1980; Hildreth, 1980; Marr and Poggio, 1979; Crimson, 1980, 1981a, 1981b; 
Ullman, 1979a]. It is the final stage — the problem of surface interpolation — that is considered here, 
to obtain complete surface descriptions, called the full 2^D Sketch [Marr, 1978; Marr and Nishihara, 
1978]. The first two stages have been considered elsewhere [Marr, 1976; Marr and Hildreth, 1980; 
Hildreth, 1980; Marr and Poggio, 1979; Grimson, 1980, 1981a, 1981b; Ullman, 1979a]. It is the final 
stage - the problem of surface interpolation - that is considered here. 

The important point is that the form of the input and output representations can influence the 
design of the transformation. Here, we shall assume that the input representation consists of explicit 
surface information, such as distance or relative distance, along the zero-crossings of the convolved 
image (these terms will be given technical definitions in Section 2). The output representation will be 
a complete specification of surface information, where by complete, we mean that an explicit distance 
value should be computed at every point on some grid representation of the scene. Our main concern 
in this paper is with the computational constraints needed to transform the input representation into 
the output representation. 

Although surface values at all points of the image are important, there is another aspect of 
surface information which should also be made explicit in the output representation. This is the set of 
discontinuities in surfaces; the occluding contours, both subjective and objective. Marr [1978] argues 
that the 2|-D sketch should be a viewer-centered representation which includes both explicit surface 
information, such as depth and surface orientation, and explicit contours of surface discontinuities. In 
this paper, the concentration is on the problem of creating explicit surface information at all points 
of the surface. The question of surface discontinuities will be outlined, and possible algorithms 
suggested, but an implementation of this stage has not been completed. 

2. Consequence of the Correspondence Problem 

We indicated above 'that we would concentrate on correspondence methods which could 
effect the three-dimensional surface reconstruction; stercopsis [Marr, 1980; Marr and Poggio, 1979; 
Grimson, 1980, 1981a] and structure from motion [Ullman, 1979a]. The three main steps of the 
correspondence problem are: (1) identify a location "in the physical scene in one image: (2) identify 
the corresponding location in a second image; cither a second image taken from a •different viewpoint 
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(stereo vision), or a second image taken at a later point in time (motion); and (3) compute a three- 
dimensional surface value, representing the distance of the point relative to some base point, based on 
the difference in the positions of the two corresponding points in the images. 

If one can identify a location beyond doubt in the two images, then the correspondence problem 
is trivial. It can be demonstrated, however, that both the stereo computation and the motion computa- 
tion can take place on very primitive descriptions of the images [Julesz, 1960; Ullman, 1979a]. As 
a consequence, the difficulty of the problem, for human vision, lies in the correspondence problem 
— which item in one image matches which item in the other. The reason for this is that for any primi- 
tive element from one description, there are liable to be many possible matching elements from the 
other description. This is especially true if image irradiance values are used as the basic descriptions. 
Consider an image of a matte-painted wall with uniform illumination. Given a small element of that 
wall from one image, it is virtually impossible to distinguish which small clement from the other view 
matches it. On the other hand, if there is a scratch or texture marking on the wall, it is likely that such 
a location can be distinguished in the two views. This suggests that the representation upon which 
the correspondence operation takes place should reflect those positions in an image at which some 
physical property of the underlying surface is changing. This representation is called the primal sketch 
[Marr, 1976; Marr and Hildreth, 1980]. 

Marr and Hildreth [1980, also Hildreth, 1980] have refined the preceeding intuitive argument 
into more rigorous computational arguments, in conjunction with evidence from neurophysiology and 
psychophysics. They argue that the primal sketch representation is computed by determining those 
locations in an image at which the corresponding surface location undergoes a change in one of its 
physical properties, for example, reflectivity, surface orientation, texture or surface material. Such 
changes will correspond to a step change in image irradiance, at some scale. There are many ways of 
detecting the irradiance changes. Marr and Hildreth argue on various grounds for using the following 
scheme: 

(1) Convolve the image with a set of filters given by the Laplacian applied to a Gaussian, 
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V^(M )= /^2a!L(-^) 

} 

where a is a constant determined from psychophysical data. 



(2) Locate all non-trivial zero- crossings in the convolved irradiances. A non-trivial zero-crossing is a 

point at which the convolved irradiance values change from positive to negative, or vice versa. 
These zero-crossings form the basic representation upon which die later visual processing takes place. 
Given this representation of the images, the correspondence problem can now be solved. Both 
UUman's [1979a] theory of motion perception and Marr and Poggio's [1979] theory of stereo vision 
perform this computation on such primal sketch descriptions. As a consequence, explicit three- 
dimensional surface information (such as distance, or surface orientation) can be computed only at 
points corresponding to zero-crossings in the primal sketch. This would yield a sparse surface repre- 
sentation. Yet clearly, our perception is of complete surfaces (see for example Figure 1). In addition, 
a "nice" boundary is found for the central square. This implies that once the correspondence problem 
is solved, either by the stereo computation or by the motion computation, an interpolation must 
be performed between the surface values given at the zero-crossings, to obtain a complete surface 
description, and surface discontinuities should be explicitly demarked. 

^ 3. The Surface Consistency Constraint 

We now turn to the heart of the matter, the computational constraints involved in the process 
of creating complete surface specifications, by interpolating between known points. We are given 
as basic input to the interpolation process, the zero-crossings of a convolved image, with depth infor- 
mation computed along these zero-crossing contours. Suppose one were to attempt to construct a 
complete surface description based only on the surface information known along the zero-crossings. 
An infinite number of surfaces would consistently fit the boundary conditions provided by these sur- 
face values. Yet there must be some way of deciding which surface, or at least which small family 
of surfaces, could give rise to the zero-crossing descriptions. This means that there must be some 
k additional information available from the visual process which, when taken into account, will identify 

*' a class of nearly indistinguishable surfaces that represent the visible surfaces of the scene. 

In order to determine what information is available from the visual process, one must first 

carefully consider the process by which the zero-crossing contours are generated. The Marr-Hildrcth 

theory or edge detection [Man and Kildreth, 1980; Hildrcth, 1980J relies on the fact that sudden 

changes in the reflectance of a surface, for example, caused by surface scratches or texture markings, 

f"*^ will give rise to zero-crossings in the convolved image. Sudden or sharp changes in orientation or 
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shape of the surface will under most circumstances also give rise to zero-crossings. This fact can be 
used to constrain the possible shapes of surfaces which could give rise to particular surface values 
along zero-crossing contours. 

We illustrate the basic argument with an example. Suppose one is given a closed zero-crossifig 
contour, within which there are no other zero-crossings. An example would be a circular contour, 
along which the depth is constant. There are many surfaces which could fit this set of boundary 
conditions (see Figure 3). One such surface is a flat disk. However, one could also fit other sMbOth 
surfaces to this set of boundary conditions. For example, the highly convoluted surface fbrMed 
by sin f \/z 2 + V 2 ) would be consistent with the known disparity values. Yet in principle, such a 
rapidly varying surface should give rise to other zero-crossings. This follows from the observation that 
if the surface orientation undergoes a periodic variation, then it is likely that the if radiance values 
will also undergo such a variation. Since the only zero-crossings are at the borders of the object, this 
implies that the surface sin (\A 2 + y 2 J is not a valid representative Surface for this set of boundary 
conditions. 

Hence, the hypothesis is that the set of zero-crossing contours contains implicit information 
about the surface as well as explicit information. If one can determine a set of conditions on the 
surface shape that cause inflections in the irradiance values, then one may be able to determine a 
likely surface structure, given a set of boundary conditions along the zero-crbssing contours. 

3.1 No News is Good News 

In general, any one of a multitude of widely varying surfaces could fit the boundary conditions 
imposed along the zero-crossings, To be completely consistent with the imaging pfbeess, however, 
such surfaces must meet both explicit conditions and implicit conditions. The explicit conditions 
are given by the depth or surface orientation values along the zero-crossing contours. The implicit 
conditions are that the surface must not impose any zero-crossing contours other than those which 
appear in the convolved image. This implicit condition leads to the surface consistency constraint, 
namely: 

The absence of zero-crossings const rains the possible "Surface shapes. 

Just as the presence of a zero-crossing tells us that some physical property is changing at a given 
location, the absence of a zero-crossing tells us the opposite, that no physical property is changing. 
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Figure 3. Possible Surfaces Fitting Depth Values at Zero-Crossings. Given boundary conditions 
of a circular zero-crossing contour, along which the depth is constant, there are many possible 
surfaces which could fit the known depth points. Two examples are a flat disk, and the highly 
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oluted surface formed by sinf sjx 1 + y 2 \ shown here. 
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and in particular that the surface topography is not changing in a radical manner. We informally refer 
to this constraint as no news is good news, since it says that the surface cannot change radically without 
informing us of this fact by means of zero-crossings. 

In order to make explicit any constraints on the shape of the surface for locations in the image 
not associated with a zero-crossing, one must carefully examine the image formation process. Many 
factors are involved in the formation of image irradiances. As a consequence, changes in any of those 
factors can cause a change in the image irradiances, and hence a zero-crossing in the convolved image. 
For example, a change in surface material can correspond to a change in albedo, and hence to a zero- 
crossing in the convoked image; a discontinuity in depth can correspond to a change in the illumina- 
lion striking the surf ice, and hence to a zero-crossing; and a discontinuity in surface orientation can 
concspond to a change in the amount of illumination rellecied toward the viowvr, and hence to a 
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zero-crossing. We are interested in showing that the inverse is also true — in particular, that in regions 
in which the illumination and albedo are roughly constant, the absence of a zero-crossing implies that 
the surface shape cannot be changing in a radical manner. 

The basic result, corresponding to the intuitive argument of Figure 3, is that the probability of a 
zero-crossing occurring in regions where the illumination is roughly constant and the surface material 
does not change is a monotonic function of the variation in the orientation of the surface normal. 
(An analytic proof may be found in Grimson [1981c].) This provides a constraint on the possible 
surfaces that could be interpolated through a set of known points, and is referred to as the surface 
consistency constraint. It means that the probability of a zero-crossing increases as the variation in 
surface orientation increases. By inverting this argument, the best surface to fit the known data is that 
which minimizes the variation in surface orientation since this surface is most consistent with the zero- 
crossings in the convolved image. 

4. The Computational Problem 

We are now ready to consider the computational problem associated with the task of construct- 
ing complete surface specifications consistent with the information contained in the zero-crqssiogs. 
The modules of early visual processing, such as stereo or staicture-from-motion, provide explicit 
information about the. shapes of the surfaces at specific locations in the images, corresponding to the 
zero-crossings of the convolved images. The surface consistency constraint indicates implicit informa- 
tion about the shapes of the surfaces between the zero-crossings, stating that between known depth 
values, the surface cannot change in a radical manner, since such changes would usually give rise 
to additional zero-crossings. These two factors will now be combined, to obtain a complete surface 
specification. 

4.1 Using The Surface Consistency Constraint 

Suppose we are given a set of known depth points. We want a method for finding a surface* 
to fit through these points that is "most consistent" with the surface consistency constraint. We will 
find the most consistent surface in two ways. In the surface interpolation problem we construct a 
surface \hM exactly fits the set of known points. The problem can be relaxed somewhat into a suffice 
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approximation problem, by only requiring that the surface approximately fit the known data and be 
smooth in some sense. 

Given the initial boundary conditions of the known depth values along the zero-crossing con- 
tours, there is an infinite set of possible surfaces that fit through those points. We need to be able to 
compare members of this set of all possible surfaces fitting through those points, to determine which 
surface is more consistent. If we can do this, then the "most consistent" surface can be found by com- 
paring all possible surfaces. A traditional method for comparing surfaces is to assign a real number 
to each surface. Then, in order to compare the surfaces, one need only compare the corresponding 
real numbers. The assignment of real numbers to possible surfaces is accomplished by defining a 
functional, mapping the space of possible surfaces into the real numbers, ®:X h-> 3J. This functional 
should be such that the more consistent the surface, the smaller the real number assigned to it. In 
order to satisfy the surface consistency constraint, the functional should measure variation in surface 
orientation. In this case, the most consistent surface will be the surface that is minimal under the 
functional. (For further details and background information about the use of functionals is, see, for 
/"~> example, Rudin [1973].) 

The key mathematical difficulty is to guarantee the existence and uniqueness of a solution. In 
other words, we need to guarantee that there is at least one surface which minimizes the surface 
consistency constraint, and to guarantee that any other surface passing through the known points, for 
which the functional measure of surface consistency has the same value, is indistinguishable from the 
first surface. This issue is not just a mathematical nicety, however, but is essential to the solution 
of many computational problems. Suppose we devise an iterative algorithm to solve some problem. 
What happens if we cannot guarantee the existence of a solution? The iterative process could be set 
off to solve an equation and never converge to an answer — clearly an undesirable state. Further, 
suppose a solution exists but is not guaranteed to be unique. Then an iterative process might converge 
to one solution when started from one initial point, and converge to another solution when started 
from a different initial point. Although small variations in the different solutions might be acceptable, 
the solutions should not differ in a manner inconsistent with our intuition about the problem. Thus, in 
the case of visual surface interpolation, the real trick is to find a functional which accurately measures 
the variation in surface orientation, as well as guarantees the existence of a unique best surface (or a 
family of indistinguishable surfaces). 
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How can we guarantee the existence and uniqueness of a solution? In our particular case of 
surface interpolation, we will be using the calculus of variations to determine a system of equations 
which the most consistent surface must satisfy, by applying the calculus to the situation of fitting a 
thin plate through a set of known points. While this system of equations characterizes the minimal 
surface, it does not guarantee uniqueness. The form of the boundary conditions (the set of known 
points) will determine the size of the family of minimal surfaces. Unfortunately, determining the 
types of input for which a unique solution exists is generally very hard. Instead, we will exploit a 
general case of the mathematical existence of a solution with the weakest possible conditions on the 
functional. That is, we will determine a weak set of conditions on die functional that are needed to 
ensure that a unique most consistent surface, or at least a unique family of surfaces that are most 
consistent, will exist. We will show that if the functional is an inner-product on a Hilbert space of 
possible surfaces, then a unique most consistent surface will exist. (A Hilbert space is an extension of 
normal Euclidean space — basically a vector space in which a dot product operation exists, so that we 
can relate the vectors to the real line, and in which functions are usually used in place of the normal 
notion of vector.) 

In general, it is extremely difficult to find a functional that measures surface consistency and 
satisfies the conditions of an inner-product. Hence, we will show that if the functional is a semi-inner 
product on a semi-Hilbert space of possible surfaces, then the most consistent surface is unique up to 
possibly an element of the null space of the functional. The null space is simply die set of surfaces 
that cannot be distinguished by the functional from die surface which is zero everywhere. In this 
way, the family of most consistent surfaces can be found. Based on the font) of the null space, we 
can determine whether or not the differences in minimal surfaces are intuitively indistinguishable, and 
what conditions on the known boundary values will guarantee a unique minimal surface, from tliis 
family. 

Having derived conditions on the functional, we need to show that there is such a functional. 
The surface consistency constraint implies that the functional should measure variation in surface 
orientation over an area of the surface. Although the condition of a semi-inner product is a mathe- 
matical requirement needed to guarantee a solution, it does not restrict in an unreasonable way the 
kinds of surfaces we consider, and gives rise to at least two very natural functionals, both of which can 
be derived from the calculus uf variations: one measures the integral of the square I.aplacian applied 
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to the surface and the other measures the quadratic variation of the local x and y components of the 
surface orientation. Besides the mathematical justifications, we will also show practical and intuitive 
reasons in support of such functionals. 

Given that there are at least two possible functionals, are there others? We will show that if we 
require a functional that is: 

1. a monotonia function of the variation in surface orientation, 

2. a semi-inner product, and 

3. rotationally symmetric, 

then there is a vector space of possible functionals, spanned by the square Laplacian and the quadratic 
variation. In other words, there is a family of possible functionals, given by all linear combinations of 
these two basic functionals. 

Given that there is more than one possible functional, how do they differ? Using the calculus 
of variations, and some results from mathematical physics, we will show that the surfaces which mini- 
mize these functionals will be roughly identical in die interior of a region and will differ only along 
die boundaries of a region. As well, the null spaces of the functionals will differ, implying different 
families of most consistent surfaces corresponding to each functional. We know that the minimal 
surface is unique up to possibly an element of the null space. Since we require diat the solution 
surface cither be unique, or a member of an indistinguishable family of solutions, the size of the null 
space is important in judging the value of a functional. Based on this, we will argue that the quadratic 
\ariation is to be preferred over the square Laplacian. If we require that the surface pass through the 
known points, we can show that the form of die stereo data will force a unique most consistent surface 
for the case of quadratic variation, while this is unlikely for functionals such as the square Laplacian. 

Thus, we assert on mathematical grounds that the best functional is one which measures quad- 
ratic variation in surface orientation, and the most consistent surface to fit to the stereo data is that 
which passes through the known points and minimizes die quadratic variation. In a later section, 
wc will see examples of the types of minimal surfaces obtained under quadratic variation and the 
square Laplacian. It will be seen that the mathematical distinction in size of null space has a practical 
consequence, as the types of surfaces which minimi/e the square laplacian will be seen to be inconsis- 
tent with our intuitive notion of the best surface to fit to the known points, while the surface which 
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minimize the quadratic variation will be seen to be much more consistent with our intuitive notion of 
the best surface. 

4.1.1 The Problem is Well-Defined 

If surfaces are to be compared, by using a functional from the space of surfaces to the real 
numbers, with the purpose of finding die surface that best satisfies the surface consistency constraint, 
it is necessary to ensure that such a goal is attainable. What conditions on the form of the functional, 
or on the structure of the space of functions, are needed to guarantee the existence of such a "best" 
surface? One key constraint on the functional is given by the following theorem. The main point of 
the theorem is that in order to ensure that the problem is well-defined the functional should have the 
characteristics of a semi-norm. 

Theorem 1: Suppose there exists a complete semi-norm 6 on a space of functions H , and that ® 
satisfies the parallelogram law (for definition, see proof of theorem). Then, every nonempty closed convex 
setE CZH contains a unique element v of minimal norm, up to an element of the null space. Thus, the 
family of minimal functions is 

{v + s\ s£S} 

where 

s = { v - w | w e e} n jt 

andfK is the null space of the functional 

X = { u | &{u) = 0}. 

Proof: [See for example Rudin, 1973]. Any space with a semi-norm defined on it can be 
associated with an equivalent normed space. Let W be a subspace of a vector space//. For every 
v £ //, let w(v) be the cosct of W that contains v, 

n( v ) z=z {v + u: u& W}. 

These cosets are elements of a vector space H/W called the quotient space of// modulo W. In this 
space, addition is defined by 

n(v) -f- tt(w>) ■■■- ir(v -f- w) 
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and scalar multiplication is defined by 

an(v) = ir(av). 

The origin of the space H/W is 7r(0) — W. Thus, it is a linear map of H onto H/W with W as its 
null space. 

Now consider the semi-norm on the vector space H . Let 

X = {v: G{v) = 0}. 

This can easily be shown to be a subspace of H. Let 7r be the quotient map from H onto H/X, and 
define a mapping &:H /X h-» 5R, 

e'Mt;)) = e(v). 

If tt(i;) = *(u;) then ©(<;— u>) = 0. Since \Q{v) — 0(w)| < 0(v— «;), then © ; (7r(t;)) = e'(ff(u/)) 
and 0' is well defined on 11/ X. It is straightforward to show that 0' is a norm on JFf/JT. 
/**\ Now we can prove the statement of the theorem. The set E, a subset of//, can be transformed 

into a set J? 7 in the quotient space H/X while preserving the convexity and closure properties. 
The parallelogram law states 

[&{v + w)f + [0'(t; - w)f = 2[&{v)f + 2[&(w)] 2 . 

Let 

d = mf{&{v):vGE f }. 

Choose a sequence v n 6 E f such that Q f {v n ) h-* d. By the convexity of £', we know that \{v n + 
v m ) G E 1 ' and so [& f [v n -\- v m j\ > 4ri 2 . If t; and u> are replaced in the definition of the paral- 
lelogram law by v n and v 7m then the right hand side tends to 4d 2 .. Hut [&[v u + v™)] > 4d 2 , so one 
must have [0'(i> n — t> m )] h+ to preserve the equality. Thus, {v n } is a Cauchy sequence in H/X . 
Since the norm is complete, the sequence must converge to some v £ £", with 0'(t>) = d. 

To prove the uniqueness, if v t w G E f and ©'(v) = d, B f {w) = d then the sequence 
{?;, w, v t xo, . . .} must converge, as we just saw. Thus v ~- w and the clement is unique. 

We have proven that under the norm 0' on the quotient space II / X, the set E 1 has a unique 
?**'• minimal element. Hence, the structure of the quotient space implies that under the semi-norm on 
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'the space #, the set E' has a unique minimal element v, up to possibly an elenfiem of the mill space JT. 
In other words, the family of minimal elements is 

{v + s\ sGS} 

where 

S={v — w\ weE}Ci)(. 

I 

This theorem specifies one set of mathematical criteria needed to ensure that there exists a 
unique minimal element. Thus, if the surface consistency constraint could be specified by a ftinctional 
that satisfied the conditions of a complete semi-norm, obeying the parallelogram law, it might be 
possible to show that there is a unique coset of "most consistent" surfaces. We would really prefer to 
be guaranteed a unique surface, rather than some set of surfaces. One way to tighten the result of the 
theorem is to require that the functional is a norm. 

Corollary 1. 1: If® is a complete norm on a space offunctionsH, which satisfies the parallelogram 
law, then every nonempty closed convex setE C // contains a unique element v of minimal norm. 

Proof: If the functional is a norm, the null space is the trivial null space, and the result holds 
uniquely. g 

The theorem can be rephrased in terms of the surface interpolation problem as follows. 
Corollary L 2: Lei the set of known points be given by 

where the associated depth value isF { , Lei G J be a vector space of "possible" functions on 5ft 2 and let 

U^{fG ( if\f(x u y i )^F l *~1,.. ;1 JV} 

so that V is the set of functions that interpolate the known data : {Fj. let be a semi-norm, which 
measures the "consistency" of a function f G X, that is, we shall say that f is "better than g //©(/) < 
©((/)• If® is a complete semi-norm and satisfies the parallelogram law, then there exists a unkjm? (to 
within a function of the null space of O) function s £ V (hat is least meoOsistenf and interpolates (he 
data. II(?,< <' the interpolation problem is well-defined. 



/ *^"V 



17 
Proof: Clearly U is a convex set since for any f t g £ U, 

[\f + (1 - \)g]fa, Vi) = (X + 1 - m = K 

for any data point (x i} j/j). Furthermore, J7 is closed, since if / n £ U and / n h-> /, then f(x i} y 2 ) = F t - 
and / £ U. Then the previous corollary states that U has a unique (to within an element of the null 
space) element of minimal norm, which is exactly the desired "most consistent" surface. | 

This corollary is a translation of Theorem 1 into the problem of interest to us, finding the surface 
most consistent with the known data from the stereo algorithm. It specifies a set of conditions under 
which the interpolation problem is well-defined. Here, the notion of well-defined refers to finding 
a solution to the interpolation problem that is unique, and by unique we mean up to possibly an 
element of the null space of the semi-norm. As a consequence, the extent and structure of the null 
space of any semi-norm chosen to incoiporate the surface consistency constraint will be important 
in determining the utility of that semi-norm. In general, the smaller the null space, the tighter the 
constraint on die family of possible surfaces which can interpolate the known data. For example, if 
the possible variations in the least inconsistent surface were very large (due to the semi-norm having 
a large null space), then the utility of this semi-norm would have to be questioned. On the other 
hand, if the null space is small, and the resulting variations in possible least inconsistent surfaces were 
small, the semi-norm would have to be given serious consideration. We will see examples of possible 
functional and their null spaces in Section 4.1.4. 

Thus, Theorem 1 and Corollary 1.1 specify two different sets of sufficient, but not necessary, 
criteria for ensuring differing types of uniqueness. In both cases, the criteria applied directly to the 
structure of the functional. Of course, the real trick is to find a functional © which captures our 
intuition of variation in surface orientation and meets the requirements needed to guarantee a unique 
solution. 

4.1.2 The Space of Functions 

Theorem 1 describes a set of sufficient conditions for obtaining a unique family of minimal 
surfaces. The fundamental point is that we require a complete parallelogram semi-norm to ensure 
a unique solution. These conditions precisely define a semi-inner product, and hence the space of 
functions over which we seek a minimum must he a semi-I iilbert space. 
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Corollary 1.3: If% is a semi-Hitbert space of possible surfaces, and@(v) = n{v,v)% is an inner 
product semi-norm, where fi{v, v)% is the semi- inner product on the space %, then there exists a unique 
surface in SF (possibly to within an element of the null space of the semi-norm) thai mmtmim the semi- 
norm © over all surfaces. 

Proof: By the definition of Hilbert space, the semi-norm is complete. It is easy to show that it 
satisfies the parallelogram law from the definition of 0(i>) == fi(v y v) 1 ?. Thus, if the space of functions 
is a semi-Hilbert space, then, by Theorem 1, die interpolation problem is guaranteed to have a unique 
minimal solution, possibly to within an element of the null space, i 

Corollary 1.4: If SF is a Hilbert space of possible surfaces, and &{v) ** ii(v } v)i is an inner 
product norm, where n(v } i/)s is the inner product on the spaced, then there exists a unique surface in^ 
thai minimizes the norm over all surfaces. § 

4.1.3 The Form of the Functional 

The major problem is to determine die functional 0. The surface consistency constraint related 
the consistency of a surface to the variation in surface orientation. This forms die first constraint oil 
the functional. Theorem 1 states that if die functional is a complete, parallelogram, semi-norm, dien 
the problem has a well-defined solution. This forms the second constraint on die functional. Thus, 
if a functional can be found that is a complete, parallelogram semi-norm, and that is a monotonia 
function of die variation in surface orientation, then diis functional will constitute an acceptable 
measure of surface inconsistency. Any surface that is minimal under such a functional would dien be 
an acceptable reconstruction of the original surface in space. 

The problem may be considered in die following manner. With every point on the surface 
/, associate a pair of partial derivatives, f £ = p,f y = g, and hence, an orientation. Kach point 
on the surface may be mapped to a point in a space spanned by p and q axes, the gradient space 
[Huffman, 1971; Mackworth, 1973; Horn, 1977]. With each surface patch, one may then associate a 
neighborhood oFp-q space by mapping die p and q values associated with each point on the surface 
into gradient space. This neighborhood will be referred to as the p-q neighborhood" spanned by the 
surface patch. 

The surface consistency constraint implies that the probability of a zero-crossing occuring witJiin 
a patch of the surface is monotonically related to the size of the pq neighborhood spanned "by die 



/^•S 19 

surface patch. The surface consistency theorem [Grimson, 1981c] embodies a specific method for 
measuring this probability. Any functional that is monotonically related to the size of the p-q neigh- 
borhood will suffice, however. This is important, since it is also necessary that the functional be a 
complete, parallelogram semi-norm. Thus, the problem reduces to specifiying such a complete, paral- 
lelogram semi-norm, which monotonically measures the surface consistency constraint by measuring 
a monotonic function of the size of the p-q neighborhood spanned by each surface patch. To find 
the most consistent surface, one need only find the surface that minimizes this functional over each 
patch. Note that the surface will be "most consistent" only relative to the functional. There may be 
several functionals that are complete, parallelogram, semi-norms and that are monotonic functions of 
the variation in surface orientation. Each will give rise to slightly different minimal surfaces. 

Of course, there are some constraints on the minimization of the measure of the p-q neighbor- 
hood. For example, each surface patch cannot be minimized in isolation. To see this, consider a 
cylindrical (or one-dimensional) surface. Between any two adjacent zero-crossing points, the mini- 
mization of the variation in surface orientation (related to the size of the neighborhood in p-q space) 
/****s would result in a single point in gradient space, corresponding to a planar surface between the known 

depth values. The problem with this simple method of reducing surface inconsistency is that it does 
not account for the interaction of surface patches. In particular, such a method would result in 
a piecewise planar surface approximation. For any three consecutive zero-crossing points, such a 
method would introduce a discontinuity in surface orientation at the middle zero-crossing. This is 
unacceptable since the surface is required to be twice diffcrcntiable. Thus, there are some constraints 
on the manner in which the neighborhoods in p-q space arc minimized. 

We are thus faced with the following problem. We know from the boundary conditions 
provided by the stereo algorithm that the surface must pass through a set of known depth points 
located at the zero-crossings of the convolved images. Wc further know that at all other points in 
the image, the surface cannot change in such a manner as to cause an additional zero-crossing. With 
each surface portion, we can associate a measure of the probability of that surface implying a zero- 
crossing in the convolved image intensities. Since between zero-crossings, there are no other zero- 
crossings, this gives a measure of the inconsistency of this particular surface portion. To choose 
the least inconsistent surface, we must reduce this probability, as measured over all portions of the 
surface. 
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4.1.4 Possible Functional 

In tliis section, possible functionals © are considered, seeking complete, parallelogram semi^ 
inner products where possible, since this will guarantee that the solution is unique to within the null 
space. However, it is important to stress that there may be several viable alternatives. The computa- 
tional theory argued that the functional should measure the "consistency" of the surface. The attempt 
here is to define a measure based on this. The measure should be in a form that allows the constraints 
on the problem to be easily expressed. Also, die measure should be a semi-inner product on a semi- 
Hilbert space, in order to ensure a unique solution. We begin by considering several candidates in 
light of these requirements. 

Case 1: One Dimension 

For ease of discussion, the case of a cylindrical or one-dimensional surface will be considered 
first. By a cylindrical (or developable) surface we mean a second differentiate surface oriented along 
the y axis such thatdf/dy = q = c, for some constant c. 

Example 1.1: The variation in the normal to the curve is clearly related to its curvature. One 
could dius consider using a functional that directly measures curvature and integrates this measure 
over an area of the surface. To ensure that die functional is positive, tliis suggests using a functional of 
the form: 



•MM'-f/^ 




where k is the curvature of the curve at a point. (Recall that the subscript here implies a partial 
derivative, so that f xx = (d 2 f/dx 2 ) 2 .) Although tliis is perhaps the most "natural" definition of a 
functional, it is not a semi-norm, and hence it is considered to be unacceptable. To see why it is not a 
semi-norm, consider the following. If/ is in the space of surfaces, then 

r „2/2 
e,(a/) = { 




^ r,dx\ 



(1 + a>flf 



f"*\ 
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This condition will be true only if f x = 0. While this is certainly far too restrictive a condition to 
place on the possible surfaces, it does suggest a possible alternative. 

Example 1.2: A second choice is the quadratic variation of the gradient, which may be measured 

r f i» 

e 2 (/) = 



{//•-*} 



Note that it is a close approximation to the curvature of the curve, provided that the gradient f x is 
small. 6 2 is a semi-norm, so the surface that minimizes this norm will be unique to within an element 
of the null space of the semi-norm. The null space of 2 is the set of all linear functions: 

Jf = span{l,a:} 

where 

span{i?i, . . ., v m } = {a\Vi -f • • • + QmVm I «i» • • •» a ™ 6 &*}• 

Not only does this form of the functional satisfy the mathematical criteria of a complete, paral- 
lelogram semi-norm, it has a strong relationship to the "natural" form ©i(/), since the restriction of 
small f x is acceptable. Those cases in which f x is not negligible correspond to situations in which the 
surface is rapidly curving away from the viewer. These situations are such that the curvature of the 
surface will cause it to be invisible in one eye — giving rise to occluding boundaries. In general, we 
can assume that the image docs not consist solely of occluding boundaries, so that such occurrences 
will be rare in an image. Moreover, between such points, the surface will satisfy the restriction and the 
above semi-norm is well-suited to the interpolation problem. 

Case 2: Two Dimensions 

To each of the examples of the one-dimensional case, there is an analogous two-dimensional 

case. 

Example 2.1: As in the one-dimensional case, one possibility is to measure the curvature of the 
surface. The curvature of a surface is usually measured in one of two ways. 

For any point on the surface, consider the intersection of the surface with a plane containing 
the normal to the surface at that point. This intersection defines a curve, and the curvature of that 
curve can be measured as the arc-rate of rotation of its tangent. For any point, there are infinitely 
many normal sections, each defining a curve. As the normal section is rotated through 2tt radians, 
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all possible normal sections will be observed. There are two sections of particular interest, that which 
has the maximum curvature and that which has the minimum. It can be shown that the directions of 
the normal sections corresponding to these sections are orthogonal. These directions are the principal 
directions and the curvatures of the normal sections in these directions are the principal cwvatures, 
denoted n a and i%. It can be shown that the curvature of any other normal section is defined by the 
principal curvatures. 

There are two standard methods for describing the curvature of the surface, in terms of the 
principal curvatures. One is the first (or mean) curvature of the surface 

J = K a -f~ K^. 

The other is the second or Gaussian curvature of the surface 

K = K a • K b . 

For a surface defined by the vector {x, y, f(x, y)} t these curvatures are given by 
and 



r, JxxJyy J xy 



2* 



Thus, there are two possibilities for the functional. One is to measure the first (or mean) 
curvature of the surface, 

@3(/)H/ I J 2 dxdy\ 




(/„(! + fl) + fyy(l + fl) - VxfyUv) 



As in the onc-dimcnsional case, this is not a semi-norm, since 




I f f (/«(1 + «V? ; ) + / w (l + a 2 H) - 2af x af u f xy ) 
&,{af) = |o| / / ^ • - -3 -J-dxdy 



1* 

> 
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However, \ff x and f y are assumed to be small, then it is closely approximated by a semi-norm. In this 
case, consider 



e.i(/) = |//(v 2 /) 2 ^dy| 



This is a semi-norm, with null space consisting of all harmonic functions, 

A second possibility for reducing curvature is to reduce the second or Gaussian curvature, 

©s(/)H/ / K 2 dxdy\ 



•r 



By an argument similar to the above, it can be shown that this is not a semi-norm. Note that by using 
the above approximation of small f x and / y , we get the functional 

4 



e 6 (/) = Uf f, X fyy ~ f xy dxdyl 



We will return to this form later. 

Example 2.2: As in the one-dimensional case, one can also consider the quadratic variation. The 
quadratic variation in p = f x is given by 

/ J(pl + Pl)dxdy 

and the quadratic variation in q = f y is given by 

J f(<ll + q l)dxdy. 

[f the surface is twice continuously differentiable, then p y = q Xi and by combining these two varia- 
tions, one obtains the quadratic variation: 

e>7(/) = (/ / (/L + Vly + fly) dxdyl . 

Again, as in the one-dimensional case, this is a complete semi-norm that satisfies the parallelogram 
law. Hence, the space of interpolation functions has an element of minimal norm, which is unique up 
to an element of the null space, where the null space is the set of all linear functions: 

jf = span{l,s, y}. 

Duchon (I975, 1976) refers to the surfaces that minimize this expression as thin plate splines 
since the expression O7 relates to the energy in a thin plate forced to interpolate the data. 
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4.2 Where Do We Stand? 

We have seen that for the general surface interpolation problem, there are two constraints on the 
possible functional. One is that the functional must measure a monotonic function of the variation in 
surface orientation. The other is that the functional should satisfy the conditions of a complete paral- 
lelogram semi-norm, or equivalently, a semi-inner product. If the functional satisfies these conditions, 
then we know that there will be a unique family of surfaces that minimize this functional and hence 
form a family of best possible surfaces to fit through the known information. In the examples sketched 
above we saw that there are at least two possible candidates for tills functional, namely the square 
Laplacian, 

e 4 (/) 



{//(W)",}' 



and the quadratic variation, 

W)H/ I [fL+Vly+fyy 




)dxdy> 



There are several points still to consider. Are there other possible functionals? How do the mini- 
mal solutions to these functionals differ? What criteria can be applied to determine which functional is 
best suited to our surface interpolation problem? What is the best functional under 'those criteria? In 
the remainder of this chapter, we shall consider these questions in detail. The point we shall develop is 
that the appropriate functional to apply is the quadratic variation, and thus the surface that Mnimizes 
this functional is most consistent with the imaging information. 

4.3 Are There Other Junctionals? 

We have determined at least two functionals that meet our conditions. Are there other possible 
functionals, and if so, how do their minimal solutions differ from those of the square Laplaciaii arid 
the quadratic variation? 

To answer this question, we rely on a result of Brady and Horn [1981], which we sketch below. 
Recall that the basic conditions on the functional were that it measureNa monotonic function of the 
variation in surface orientation, and that it be a semi-inner product. The first requirement suggests 
thai iIk hmctional must involve terms that are functions of the second order partial derivath es of the 
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surface, since such terms will be related to the variation in surface orientation. The second require- 
ment is needed to ensure the uniqueness of the solution. Recall that //(/, g) is a semi-inner product 
if 

1. V>{f,9) = V>(9,f)> 

2. Kf + 9 >h) = ^(/, h) + fjt(g f h), 

3. v{af, g) = a/i,{f, g), 

4. Mt/,/)>0, 

and that given a semi-inner product /x(/, g\ one can define the desired functional by ©(/) = 

MA/)*. 

The difficult condition to satisfy is (3), which implies that the semi-inner product should not 
contain any constant terms. The conditions taken together imply that we should consider any quad- 
ratic form as a possible semi-inner product: 

M/> 9) = J J afxx9xx + Pfxy9xy + lfyy9yy + 

+ S(fxx9xy + fxy9xx) + t(fxx9yy + fyy9xx) + t(fxy9yy + fyy9xy)- 

Thus, the corresponding functional will have the quadratic form; 



*/>-//« 



•/L + 0/1, + til, + 2S i"i*« + 2e i"fv» + ifUvh 



yy 



The final condition we apply to the functional is that it be rotationally symmetric. This follows 
from the observation that if the input is rotated, the surface that fits the known data should not change 
in form, other than also being rotated. 

Minimizing the quadratic form of the functional ©(/) can be considered as finding the mini- 
mum over the integral of the function (A/) r MA/ where A/ is the vector: 



A/ 



and M is the symmetric matrix 



M 
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HOW DO THE FUNCTIONALS DIFFER? 

im is a rotation matrix, then the condition of rotational symmetry is given by 

(RAf) T M(RAf) = AfMAf. 
Vector algebra implies that we must have 

R T MR = M or R T M = MR~K 
Equating elements shows that the matrix M must have the form 

+ e e 
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M = 







§+ e 





There are two important consequences of this fact. The first is that the set of all possible func- 
tionals forms a vector space, since if Af, and M 2 satisify the conditions, then so does oM, + ^. 
The second is that this vector space of operators is spanned by the square Laplacian and the quadratic 
variation since: 
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The first term of the sum corresponds to the square Laplacian while the second corresponds to the 
quadratic variation. Thus, for e = 1 and p = 0, the functional reduces to square Laplacian. For 
e = and/? = 2, the functional reduces to quadratic variation. Finally, if e == \ and/3 = -1 
we obtain a functional which corresponds to the approximation to the integral of square Gaussian 
curvature derived in Section 4.1.4. 

Thus, we have answered our second question. There are other possible functional* but they are 
all linear combinations of the two basic functionals, the square Laplacian and the quadratic variation. 

4.4 How Do the Functionals Differ? 

Given that there are many possible functionals, all linear combinations of the square Laplacian, 
0-1, and the quadratic variation, 7 , we must consider how (he solutions to the square Laplacian and 
the quadratic variation differ. In other words, is there any noticeable difference in the surfaces that 
minimizes these two functionals, subject to fitting through the stereo data? To answer this question, 
wc shall rely on the Calculus of Variations, (see, for example, Courant and 1 lilbert [1953] and Forsyth 
fl%0|). lb.' salient points are outlined below. 
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4.4.1 Calculus of Variations 

The calculus of variations is frequently used to solve problems of mathematical physics, and is 
applicable to our surface interpolation problem. In particular, we can use the calculus of variations 
to formulate differential equations associated with problems of minimum energy. Suppose we are 
given a thin elastic plate, whose equilibrium position is a plane, and whose potential energy under 
deformation is given by an integral of the quadratic form in the principal curvatures of the plate. 
We can consider the interpolation problem as one of determining the surface formed by fitting a 
thin elastic plate over a region <& (with boundary T) and through the known points. Using a small 
deflection approximation, the potential energy is given by 

The solution to the interpolation problem is then the surface which has the minimum potential 
energy. 

The calculus of variations can be used to characterize this problem by providing a set of 
^^ differential equations (called the Euler equations) which the solution surface must satisfy. It can be 

shown (see Courant and Hilbert, [1953, p.251]) that the Euler equations for the interior of any region 
% are given by 

V / = fxxxx + *fxxyy + fyyyy = 0, 

except at the known points. Along the boundary contour T of the region, the solution surface must 
satisfy the equations (called the natural boundary conditions): 

M(f) = -V 2 / + (1 - ri(f xx z 2 8 + 2f xlJ x H y 3 + fyyyl) = 

P (f) = g~ V2 / + (1 - V) g~( IxxXnZs + fxy^nVs + XsVn) + fyyUnVs) = 0, 

where dfdn is a derivative normal to the boundary contour, d/ds is a derivative with respect to 
arclcngth along die boundary contour and x 3i y 8 and x n> y n are the direction cosines of the tangent 
vector and the outward normal respectively. The constant \i denotes the tension factor associated with 
the elastic material of the plate. 

There are two subcases of particular interest. In the first case, suppose that the tension factor is 
given by \i = 1. The energy equation reduces to 



//. 



{V'ffdxdy 
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which is simply the square Laplacian condition derived previously. The Eulej equation is die bihar- 
monic equation V 4 f = while the natural boundary conditions are 

V 2 / = 

along die boundary contour T. In the second case, suppose diat die tension factor is given by (i — Q. 
The energy equation reduces to 



ffjfl + 2fly+fly)d X dy 



which is simply the quadratic variation condition, also derived previously. The EuJer equation is 
identical to that of the square Laplacian, namely the bihamionic equation V 4 / == 0. The natural 
boundary conditions are different, however. They are given by 



d 
dn 



- V 2 / + [f xx x] + 2f xy x 3 y a + f yy yfj = 0, 
V2jf ~^~ ds\J xxXnXs + f*iA Xn V a + x *y») + fyyVnVs J = 0. 



In the simple case of a square boundary, oriented with respect to the coordinate axes, the boundary 
conditions reduce to: 

^Jxxy ~r Jyyy :===z " 

along the boundary segments parallel to the x axis, and 

Jxx =:= " 
Zjyyx ~T fxxx ' s==: 

along the boundary segments parallel to the y axis. 

These boundary conditions can be straightforwardly simplified to: 

/i/y = 

Jxxy ^^ " 

along the boundary segments parallel to the x axis, and 

Jyyx ~~ " 

atoiiii the boundary segments parallel to the y axis. 
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We have thus answered our question. For both the square Laplacian and the quadratic 
variation, the Euler equations are identical in the interior. The only difference to be noted in the 
extremal solutions to the two functionals will be observed at the boundaries of the surfaces. When we 
look at examples of solving these equations, this difference will become important. 

There is a second manner in which the minimal solutions to the functionals will differ, in part 
related to the difference in boundary conditions of the two solutions. While the form of the minimal 
surface under either functional is roughly the same, except at the boundaries, this minimal surface will 
be uniquely determined only to within an element of the null space of the functional. This will be an 
important factor in determining which functional is best suited to our problem, since we would like 
the boundary conditions provided by the stereo data to completely determine a unique solution. The 
null spaces of die two functionals differ greatly, since the null space of the quadratic variation is the 
space of all linear functions, while the null space of the square Laplacian is the much larger space of 
all harmonic functions. We shall consider the effect of this difference later. 

4.5 The Best Functional 

Given that the set of possible functionals forms a vector space spanned by the square Laplacian 
and the quadratic variation, what criteria can be applied to determine the best functional? Since the 
Euler equation for both of these basis operators is the biharmonic equation V 4 / = 0, the same will 
be true of any other operator in the vector space. This implies that aside from boundary conditions at 
the edge of die region being interpolated, die surfaces provided by any operator in this space will be 
basically the same. 

This being the case, the only other characteristic that can distinguish between the possible func- 
tional is the size of their respective null spaces. Let us denote the null space of the square Laplacian 
by Jf] (the space of all harmonic functions) and the null space of die quadratic variation by M 2 (the 
space of all linear functions). Note that X2 is a subspacc of Jfj. Now die null space for any linear 
combination of diese two operators must contain at least the space spanned by the intersection of the 
two null spaces Jfi and J^. Hence, the null space of any other operator must consist at least of the 
linear functions. Thus, no possible operator can have a null space smaller than that corresponding to 
quadratic variation. 
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The importance of the null space is that it helps determine the family of surfaces that ar<f mini- 
mal under the functional. The requirement we impose on the best functional is that the member of 
this family corresponding to the minimal surface be uniquely determined, when combined with the 
requirement that the surface must pass through the known points provided by the stereo algorithm. 
Clearly the smaller the size of the null space, die fewer the requirements we must impose on the 
output of the stereo algorithm in order to insure a unique solution. 

We may view this criterion in the following manner. Initially, we start with the space of all pos- 
sible functions, namely, die space of all second differentiate functions of two real variables, denoted 
C 2 (?R 2 ). If we restrict our attention to those surfaces that pass through the boundary conditions 
imposed by the stereo or structure-from-motion data, we define a convex subset U of this space. If we 
define a functional on this space, the set of surfaces that are minimal under the ftinctional are given by 

{v + w | w & W) 

where 

w = {v — u\ ueu\ntf, 

for some minimal surface v G U. We are guaranteed a unique solution to the interpolation problem 
if W is empty (or equivalent^, consists only of the null surface, defined to be zero everywhere). 
The key question becomes: can we have two surfaces that fit through the known points, have the 
same measure of surface consistency (the same value as measured by the functional) and differ by an 
element of die null space? If not, we arc done, as the minimal surface is then guaranteed to be unique. 
Thus, the structure of die boundary conditions provided by the stereo algorithm (or the structure- 
from-motion algorithm) may be important in deciding which amctional is more suitable. Clearly, the 
smaller die subspace of minimal surfaces, the more likely we are to have a unique minimal surface 
fitting the known data, as the set W is more likely to be empty. 
Recall diat the null space of die square Laplacian 

e(/) 
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is the set of all harmonic functions. We wish to know what form of the boundary conditions will 
uniquely determine the harmonic function. "This problem is known as the Diridilct problem in 
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classical analysis, and it has long been known that if die boundary conditions consist of a series of 
closed, bounded Jordan curves, then the harmonic function is uniquely determined. These are, of 
course, sufficient, but not necessary conditions. It would appear, however, from these conditions 
that it is unlikely that the boundary conditions provided by the stereo algorithm will be sufficient to 
uniquely determine the component of the null space. This follows from the observation that the stereo 
algorithm is capable of providing boundary values at scattered points in the image, corresponding to 
the zero-crossings of the convolved image, while the Dirichlet problem is uniquely determined if the 
boundary values form a closed, bounded Jordan curve. Thus, in the case of the square Laplacian, the 
best we can do is determine a family of most consistent surfaces, which differ by harmonic functions. 
Referring back to our earlier question, we see that in this case, we could have two (or more) surfaces 
which fit through the known points, have the same measure of surface consistency, and differ by an 
clement of the null space. The variation in such a family of surfaces is not consistent with our intuitive 
notion of indistinguishable surfaces, that is, the difference in die shape of two surfaces which have 
identical minimal values for die square Laplacian measured over the surface can be noticably large. 
r^- As a consequence, we consider the square Laplacian to be a poor choice for the functional. 

On the other hand, the null space of the quadratic variation 

i 
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is the set of all linear functions. The boundary conditions required in this case to uniquely determine 
the component of the null space are much simpler. In particular, if the stereo algoridim provides at 
least three non-colincar points, the element of the null space is uniquely determined to be the null 
surface (the surface which is zero everywhere). It is clear that in almost all imaging situations, the 
stereo algorithm will be capable of providing the necessary boundary conditions, and thus the most 
consistent surface is uniquely determined. 

Thus, we have seen that the only possible functionals that can be used to measure the surface 
consistency constraint form a vector space spanned by die square Laplacian operator and the quad- 
ratic variation operator. The minimal surface for any such operator satisfies the biharmonic equation 
in the interior of the region being interpolated, but along the boundaries of the region it may satisfy 
different differential equations than die minimal solution of any other operator. In general, this 
implies (hat the solution surfaces corresponding to different operators will generally differ in shape 



THE COMPUTATIONAL PROBLEM 32 

only near the boundaries... To distinguish between possible operators, we examined. the fonn of thpir 
null spaces. We showed that the operator with the smallest null space was the quadratic variation. 
.Further, the stereo data is in general sufficient to uniquely determine the component of the null space 
corresponding to the minimal surface. That is, the surface that minimizes the quadratic variation, sub- 
ject to passing through the known points provided by the stereo or structure-from-mption algorithms, 
is uniquely determined. 

4.6 The Computational Problem 

By combining the results of the last two chapters, it is now possible to state the computational 
theory of the problem of interpolating visual surface infcro^ticm. 

The Interpolation of Visual Information: Suppose one is given a representation consisting of 
surface information at the zero- crossings of a Primal Sketch description of a scene (this could be either 
from the Marr-Poggio stereo algorithm, or from the Ullman structure-from-motion algorithm, or both). 
Within the context of the visual information available, the best approximation to the original surface in 
the scene is given by the minimal solution to the quadratic variation in gradient (or surface orientation) 
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Such approximations are guaranteed to be uniquely "best" to within an element of the null space of the 
functional ©. In the case of quadratic variation, the null space is the set of all linear functions. Provided 
the set of known points supplied by the stereo algorithm or by the structure from motion algorithm 
includes at least three non-colinear points, the component of the surface due to the null space is uniquely 
determined to be the null surface Hence, the surface most consistent with the visual information is 
uniquely determined. 

It is worth noting that although the above statement is phrased in terms of zero-crossings ob- 
tained from images convolved with V 2 G filters, the heart of the statement is much broader in scope. 
The key point is that to interpolate any surface representation which contains explicit infonnation 
only at sparse points in the representation, we need to find the "rnost conservative" surface consistent 
with the input information, '['his implies that between the known sin (ace points, the surface should 
vary a> Intle as possible. Thus, whether those known points correspond to zero-crossings, ed^es, or 
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some other basic descriptor of image changes, the surface interpolation algorithm should construct the 
surface which minimizes variation in the surface between the known points. 

It is interesting to compare the criteria for surface interpolation developed here, as well as the 
specific theory of surface interpolation stated above with the work of Barrow and Tenenbaum, [1981]. 

5. Constrained Optimization 

Our goal throughout this paper has been to find the surface that best fits the known data 
provided by the stereo algorithm or the structure-from-motion algorithm. In the preceeding sections, 
we saw that such a "best" surface exists and is characterized as die surface that minimizes die func- 
tional of quadratic variation, measured over the surface. The problem we address now is how to find 
that minimal surface. What is meant by "finding the minimal surface"? Our goal is to derive an 
algorithm that computes explicit surface values (such as depth, or relative depth) at all points on a 
discrete grid, m points on a side. (That is, the scene will be partitioned into an m X m grid, and to 
each grid point, we want to associated a surface value.) 

In general terms, we are seeking an algorithm to solve an optimization problem — we want to 
compute the values of a set of parameters that optimize some function. In our case, the parameters 
correspond to the surface values at the grid points, and the function to be optimized is die measure of 
a discrete correlate to quadratic variation over the surface. We will restrict our attention to the class of 
optimization algorithms that satisfy three simple criteria of biological feasibility — parallelism, local- 
support, and uniformity. These three criteria, together with the form of the input data — scattered 
contours of known points — preclude many of the possible techniques for solving an optimization 
problem, but also suggest the use of techniques such as those of mathematical programming. 

5.1 The Role of Algorithmic Criteria 

An essential problem for any computational theory about early visual processing is to determine 

the implicit assumptions used by the visual system to perform the computation. These are valid 

assumptions about die environment that arc explicitly incorporated into the computation. Ullinan's 

rigidity assumption in visual motion perception [Ullman, 1979a], Man* and I iildrcth's condition of 

./ - ** v linear sanation and spatial coincidence assumption jMarr and I lildreth, 1980], and Marr and PogguVs 
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assumptions of uniqueness and continuity [Marr and Poggio, 1979] arc three examples. Such assump- 
tions may be considered as computational constraints on the problem. 

There is a second set of criteria that may be applied to any theory and, more importantly, to aiiy 
algorithm for a theory. They deal with the requirement of biological feasibility, and are important if 
one is to describe a model of the human system. They will be termed algorithmic criteria. Ullman 
[1979b] has listed a number of such criteria that should apply to any biologically feasible algorithm. A 
similar set is briefly sketched here. 

Parallelism 

The need to process large amounts of input data in short amounts of time implies the use of 
computations that can be implemented in a parallel manner, using a large number of intercon- 
nected processors. 

Local Support 

If the number of processors involved in the computation is large, it becomes infeasible to con- 
nect each one to all of the others. Rather, there should only be local connections between the 
processors. Here, "local" means not only that the number of connections be small, but also that 
since the information being processed has a two-dimensional plane as an underlying coordinate 
system, the connections should also be local in a spatial sense. If die support of a function, 
defined on a two-dimensional grid, is the set of points on the grid that contribute in a non- 
trivial manner to the computation of the function, then our requirement is that the processors 
implementing our computation must have local support. 

Uniformity 

One final consideration, though not as critical as the first two, concerns the uniformity of the 
processors. If it is possible, an algorithm that utilizes parallel networks of identical processors 
will be favored over other algorithms. Such a requirement is not crucial, however. 

Although the original motivation for such restrictions on an algorithm arises from consideration 
of the human visual system and restrictions of biological feasibility, they could apply equally well to 
other types of image processing systems. As such, they are taken as general criteria- for the computa- 
tions to be investigated, regardless of whether the algorithm serves as a model of the human system. 
In designing algorithms to solve a particular visual process, the first step is to seek a method that 
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solves the problem. Having done so, one can then consider its applicability in light of the criteria 
outlined above, and possible modifications to the algorithm in order to satisfy those criteria. 

5.2 Mathematical Programming 

The surface interpolation problem, as we have developed it, can be viewed as an optimization 
problem; that is, the solution to the surface interpolation problem is equivalent to the minimal point 
of an objective hypersurface. There is a large body of literature on methods for finding the solution 
to optimization problems in general. In considering which one to apply to our problem, we take 
two factors into account. The first is the form of the input data supplied by stereo (and possibly 
also structure-from-motion). The key point is that the set of known points will generally consist of 
a series of zero-crossing contours, along which the depth is known. These contours are not closed, 
since the horizontal components will have no disparity value, and hence no depth value, assigned to 
them. Further, they tend to be scattered at random rather than distributed uniformly over the grid. 
(This removes many methods from further consideration.) The second factor is the architecture of the 
possible algorithm, outlined by die algorithmic criteria of die previous section. As a consequence of 
diesc two factors, many of die possible methods, while perfectly valid solutions mathematically, are 
not readily applicable to our problem. A comprehensive review of the types of methods may be found 
in Schumakcr [1976] (see also Grimson [1980, 1981b]). 

Given that an algorithm used to solve the visual surface interpolation problem must be local, 
parallel and uniform, and must be capable of dealing with scattered input data, one of the best 
methods to use is mathematical programming, and in particular, nonlinear programming. Ullman 
[1979b] has shown that many problems of relaxation and constrained optimization can be solved by 
local nonlinear programming processes (see also Hummel and Zucker [1980]). Indeed, a method 
similar to that outlined here was used by Ullman in solving die motion correspondence problem 

[Ullman, 1979a]. 

Recall that the problem with which we are faced is to find the surface that minimizes a func- 
tional measuring surface consistency. The most likely candidate for this functional is the quadratic 
variation. The boundary conditions with which the surface must agree are depth values along 
the zero-crossings, given either by the Marr-Poggio stereo algorithm or the Ullman structure- from- 
moiion algorithm. These boundary conditions can be met in one of two ways. If die surface is 
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required to fit exactly through the boundary points, the problem is one of surface interpolation. If 
the surface is required only to pass near the known points, while minimizing some error function, the 
problem is one of surface approximation. In the following sections, both problems will be examined. 

Two cases of optimization will be examined: unconstrained optimization, which is applicable 
to the approximation problem, and constrained optimization, which is applicable to the interpolation, 
problem. Standard algorithms for computing the solution to the optimization problem for each case 
are sketched below. For more details on mathematical programming, see for example Luenberger, 
[1973]. 
The Conjugate Gradient Algorithm 

Starting at any point x G E n define do = —go = b — Qx and 

x fc+ i = x fc + a k &k 

dfc+i == — gfc+i + Acdfc 

a — Bfc+iQ dfc 
ft " dlQd, 

...wheregfc = Qxfc — b.| 

We will apply this algorithm, for the case of unconstrained optimization, to the problem of 
visual surface interpolation in the next section. Because the algorithm is solving an unconstrained 
optimization problem, it will be applicable to die surface approximation problem, where the surface is 
required to pass near, but not necessarily through, the known points. 
Gradient Projection Algorithm 

The algorithm may be summarized as follows. 

1. Find the subspace of active constraints M, and form the matrix A P . 



I - aJ(a p aJ) 1 a 



2. Calculate the projection matrix P& = 

-P*V/(x) r ,' 

.3. If d 5^ 0, find the scalar c\ that maximizes 

{a: x + ad is feasible} 

and the scalar c 2 that minimizes 

{/(x + ad): 0<a<c,} 



and the direction vector d — 
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as a function of a. Set x to x + c 2 d and return to (1). 

4. If d = 0, find = -(A p A^)~ 1 A p V/(x) r . If fy > 0, for all j corresponding to active 
inequalities, stop, as x satisfies die Kuhn-Tucker conditions. Otherwise, delete the row from A p 
corresponding to the inequality with the most negative component of/? and return to (2). | 

We will apply this algorithm, for the case of constrained optimization, to the problem of visual 
surface interpolation in the next section. Because the algorithm is solving a constrained optimization 
problem, it will be applicable to the surface interpolation problem, where the surface is required to 
pass through the known points. 

6. The Interpolation Algorithm 

The algorithms of the previous section can now be applied to the problem at hand, the inter- 
polation (or approximation) of visual surfaces from the stereo data. Recall that the interpolation 
problem was stated as: 

The Interpolation of Visual Information: Suppose one is given a representation consisting of 

surface information at the zero-crossings of a Primal Sketch description of a scene (this could be either 

from the Marr-Poggio stereo algorithm, or from the Ullman structure-from-motion algorithm, or both). 

Within the context of the visual information available, the best approximation to the original surface in 

the scene is given by the minimal solution to the quadratic variation in gradient 

6(S) = {/ / ( 5 « + 2 ^ + S ^ dxdy } ' 

(where s denotes a surface). Such approximations are guaranteed to be uniquely "best" to within an 
element of the null space of the functional 0. In the case of quadratic variation, the null space is the 
set of all linear functions. Provided the set of known points supplied by the stereo algorithm or by 
the structure-from-motion algorithm includes at least three non-colinear points, the component of the 
surface due to the null space is uniquely determined to be the null surface. Hence, the surface most 
consistent with the visual information is uniquely determined. 

We shall consider solving this optimization problem both in the case of interpolation (the sur- 
face passes exactly through the data) and in the case of approximation (the surface passes near the 
data). Although the algorithms could be either applied to the square 1 aplacian or to the quadratic 
variation, we shall examine only the case of the quadiatic variation. 
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6.1 Conversion to the Image Domain 

The problem, as stated, lies clearly within the domain of continuous functions. Yet this is 
not appropriate to the case at hand. In order to establish an algorithm for transforming the visual 
information into a representation of the surface shape, a number of conversions must take place. 

The first point to note is diat the functional 6(s) consists of a square root. (Note diat we will use 
s to denote die surface which we are fitting to die known points, to distinguish it from the notation 
of s used in the previous chapter to denote the objective function.) However, clearly any function 
which minimizes the functional 0(s) also minimizes die functional G 2 (s), and vice versa, provided 
diat the functional is always positive in value. Hence, without loss of generality, one may consider the 
minimization of 

@(*) = //(4. + 24 y + 4)^- 

Throughout this section, this will be die functional to be minimized. 

In order to determine the structure of the algorithm, one must address the issue of die form of 
die output representation, since that will have a major effect on the actual algorithm. In this case, it 
is desired that the surface information be specified only at particular places within the representation 
of die scene. This will be accomplished by requiring diat die interpolation algorithm compute explicit 
depdi values at all locations within a Cartesian grid of uniform spacing. Although both the spatial 
resolution of die grid and the resolution of the depth information stored within that grid should be 
determined, it is considered that such parameters are not critical to die development of die algorithm. 
Hence, these parameters will be assigned arbitrary values. 

The continuous functional must now be converted to a form applicable to a discrete grid. 
Without loss of generality, assume diat die grid is of size m X m. Each point on the grid may be repre- 
sented by its coordinate location, so diat the point (i, j) corresponds to die grid point lying on the i th 
row and the j th column. At each point (i t j) on die grid, a surface value may be represented by s {iJ y 
Bach such surface value may be considered as an independent variable, subject to the constraints 
of the problem, of course. Using eidier row major order or column major order, these variables 
8( it j) may be considered as a vector of variables, denoted s = {%o),S(b,i)> • ■ .,% m — i),(m -i))}« 
(For clarity, a straightforward transformation from the doubly-indexed grid coordinates into a singly- 
indexed vector coordinate can be established. For example, the grid point (i, j) can be mapped 
to the \cci.or point k ~- mi -f j, and the vector point A: can be mapped 10 the grid point (*', j) - 
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([jfc/mj, Jfe — m|fc/mj).) It is this vector which will be modified using the non-linear programming 
algorithms, and the final value of which will form the solution to the optimization problem and thus 
correspond to the desired interpolated surface. 

Having converted the surface function to a discrete grid format, it is now necessary to convert 
the objective function of the optimization problem to a discrete format. This means that the 
differential operators must be converted to difference operators. There are many possible dis- 
crete approximations to the differential operators. We choose to use the following approximations 
[Abramowitz and Stegun, 1965, p. 884]. 

The second partial derivative in the x direction may be approximated by 

where h is the grid spacing, and 0{h 2 ) indicates that the approximation is valid to terms of order h 2 . 
Similarly, the second partial derivative in the y direction may be approximated by 



/—n & 2s {i,j) _ 1 



foij-H) - 2a ii.i) + M-i-i)} H" °( h2 )- 



dy 2 h 2 
The cross second partial derivative can be approximated by 



d% 



tr = ifon-u-H) - *.-+i,i-o - 5 (;+u-i) + ^.*— i,i— o] + °^ 2 )- 



dxdy 4h 2 ' 

Note that such approximations have frequently been used in the image processing literature, (for 
example, see the reviews of Davis [1975], Rosenfcld and Kak [1976], Pratt [1978]). Little is known of 
the affect of these approximations on the behavior of the result. 

Having converted the surface function and the differential operators, one must convert the 
double integral to a discrete equivalent. This can easily be done, by using a double summation 
over the finite difference operators applied to the discrete grid. One minor point is noted. While 
it is straightforward to form the discrete equivalent to the double integrals / / s 2 xx dxdy and 
/ / ^jydxdy, the cross term 2 / / s 2 xy dxdy is handled differently. In particular, consider a second 
grid, superimposed on the first, which has twice the spatial resolution of the first (that is, all integer 
points arc represented as are all points ($,£)). For the cross term, we shall apply the finite difference 
^""^ operator to all half integral points on tins finer grid. The combination of these operators yields the 
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discrete objective function: 

ra — 2 m — 1 • \2 

minimize ]T] ]T] ( S (i-U) — 2s (i,j) + 5 (H-U)J 

m — 1 m — 2 • \2 

+ Xj X; (^'ii--i) — 2 ^') + ^*»i+o) 

m — 2 m — 2 • \2 

+ 2 Xj 2 l^i) — ^*+ 1 »i) — ^li+O + ^+ij+i)) • 
2=== o j~0 v ' 

Finally, the characterization of the constraints must be considered. The case of interpolation 
will be considered first, where the interpolated surface is required to pass through the known points. 
Let 3 = {{%, j) | there is a known depth value at the grid point (i, j)} be the set of grid points 
for which a depth value is known. Then the constraints on the optimization problem have the form 
S(i,i) — £(i,j) — for all points (i, j) in the set 3\ and where the c^j^s are a set of constants reflecting 
the stereo data. Note that the set of constraints are all equality constraints. 

6.2 The Gradient Projection Interpolation Algorithm 

It is now possible to consider applying the gradient projection method to this problem: 



m — 2 m — 1 • \2 

X] Xj ( S ('~U) — 2s (iJ) + fy+U) ) 



m— 2 ro— 1 / \2 

minimize 



— 1 m— 2 



m — i m — & s \ri 

+ E E (^.i-i)-* 2 ^.j)+^«j+i)) 

m — 2 m — 2 • \ 2 

+ 2 Xj X] (^i)~^+i^ — ^M+i) + ^t+i,;+i)l • 

subject to s (l - ^ — c (i j) == 0, V(i, i) G 3\ 

To apply the method of gradient projection, it is necessary to determine the set of active con- 
straints, and the projection matrix onto the subspaee spanned by the active constraints. Clearly, since 
all ilk. Cun:.liai'nt.s are equality constraints, they arc all active at c\ery iteration. Thus, the maliix \ !> 
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(where p = \$\) has rows consisting of a 1 in the position corresponding to the grid point (t, j) for 
(ij) e f and O's elsewhere. One can easily show that A p Aj = I and that A^A P = 6 S where fc is 
a matrix consisting on O's except for those rows corresponding to a point in !f , such rows containing a 
1 for the diagonal element. Thus, the projection matrix P = I — 5? consists of all O's except for the 
diagonal elements in those rows corresponding to a grid point not in 3\ such elements being 1. The 
effect of the projection matrix P is to ignore any components of the direction vector d corresponding 
to a known point, while preserving all other components, unaltered. 

Recall that the direction vector d is determined by the projection of the negative gradient of 
the objective function. By expanding the double summation and performing the differentiation, the 
components of the gradient of the objective function are given, in this case, by the following: 
For all elements in the center of the grid, apply die following stencil to the grid representation of the 

surface function s: 

2 

4 -16 4 

2 _16 40 —16 2 

4 -16 4 

2 

By this, we mean that given a two-dimensional grid representation of the current surface approxima- 
tion, s, the value of the component of the gradient of die objective surface at some point (ij) on the 
grid is obtained by applying the above stencil centered over that point (t, j), multiplying the value of 
each of the stencil points with the value of the surface at that point and summing these products. The 
value of the components of the gradient can be computed in this manner by applying die stencil to all 
points in the center of the grid. 

Along the outer edges of the grid, the above stencil docs not apply. Instead, a careful expansion of the 
gradient of the objective function shows that the following stencils should be used. 
For elements in the corners of the grid, apply the following stencil (or its appropriate rotations and 
reflections) to the grid representation of the surface function s: 

2 
-8 4 

8-8 2 
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For elements along an outside row of the grid, one point removed from th<? corner, apply the fol- 
lowing stencil (or its appropriate rotations and reflections) to the grid representation of the surface 

function s: 

2 

4 -12 4 

.—8 20 —12 2. 

For elements along an outside row of the grid, more than one point removed from any corner, apply 

the following stencil (or its appropriate rotations and reflections) to the grid representation of the 

surface function s: 

2 

4 -12 4 

.2 —12 22 —12 2. 

For elements along a row second from the outside of the grid, located one element from each of 
two outside rows, apply the following stencil (or its appropriate rotations and reflections) to the grid 
representation of die surface function s: 

2 
4 -16 4 

-12 36 —16 2 
4 -12 4 

For all other elements along a row second from the outside of the grid, apply the following stencil (or 
its appropriate rotations and reflections) to the grid representation of die surface function s: 

2 
4 -16 4 

2 —16 38 -16 2 
4 -12 4 

Thus, die direction vector has zero valued components at all points corresponding to known 
depth values, and non-zero valued components elsewhere, with value given by the result of convolv- 
ing the above stencils with the current surface approximation. It is interesting to note that the stencil 
used in the interior of the grid is a finite difference approximation lo the inharmonic equation y 's •-■= 
jAbiiiiinn'. ii, and Stegun, 1965. p.885|. This should not be surprising, since the Filler equation, 
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derived previously from the calculus of variations, was precisely this equation. Thus, we see that the 
quadratic programming algorithms implicitly solve the Euler equation. 

We have determined the form of the direction vector, which specifies the direction in which to 
move in order to reduce the objective function and refine the surface approximation. To determine 
the amount to move in this direction, it is necessary to determine the minimum value of the objective 
function along this direction, that is, to determine the value of a such that 



m — 2 m — 1 s 

+ad(i-i }j) — 2ad (l)i) + ad {i+ifj) J 

m — 1 m — 2 • 

+ ad (f,j— 1) — 2Qd (i,i) + ad {iJ+l) J 
m — 2 m — 2 • 

+ 2 X] I] ( %W ~ ^*'+ 1 ^) — ^*^+ 1 ) + ^+i,i+i) 

+ad {i}j) — ad( i+ x tj ) — ad(i,j+i) + ad (*-fi,i-H)J 



is minimized. A straightforward application of calculus determines that this value for a is given by the 
^ m ^ K -- ratio of a -- " l where 
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and 



i — 2 m — 1 



m — l m — I • \ 

m — 1 m — 2 • \ 

+ XI XI (^,i-i) — 2 ^J)+^i+i)) 

m — 2 m — 2 • \ 

+ 2 2 XI ( ^ ~ 5 (^+Li) - ^-,i+i) + s (*-H ,i+i) ) 

[ d (hj) — d (**-H,i) ~~ d (M-H) + rf (2+i,i4-i) J 



n — 2 m — 1 • \ 2 

X X ( rf (^-Li)'~ 2d (^i) + d 04-Li)) 



m — 2 m — 1 

a 2 = 
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Thus, the algorithm is completely determined. The steps consist of: 

0. Determine a feasible initial surface approximation (any surface approximation which contains the 
known stereo depth values c ( , j) at the known points (i, j) £ l S will suffice). 

1. Compute the gradient of the objective function by convolving the grid representation of the cur- 
rent surface approximation with the stencils listed above. Compute the direction vector by taking the 
negative of the gradient, setting any components corresponding to known depth points to zero. 

2. Compute the scalar a which specifies the amount to move along the direction vector on the 
hypcrsurface defined by the objective function, by the formula given above. 

3. Refine the surface approximation by incrementing the current surface approximation with the 
scaled Jiki lion vector. 
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4. Return to step (1) and continue until the magnitudes of all components of the direction vector are 
smaller than some constant e. 

6.3 Examples of Interpolation 

We can demonstrate the effectiveness of the surface interpolation algorithm by considering the 
performance of the gradient projection algorithm on several examples. Although the previous discus- 
sion dealt specifically with applying the gradient projection algorithm to the quadratic variation, a 
similar analysis can be performed for other functionals such as the square Laplacian. (Recall that any 
feasible functional was a linear combination of these two functionals.) 

To demonstrate both the effectiveness of the interpolation algorithm and the difference between 
the quadratic variation and the square Laplacian, we consider three synthetic examples in Figures 4- 
6. In Figure 4, the interpolation algorithm is given as boundary conditions a set of closed contours 
from a cylinder, oriented parallel to the z-axis. It can be seen that the surfaces obtained by minimizing 
the square Laplacian and the quadratic variation differ markedly along the edge of the region. This 
is to be expected for two reasons. In Section 5, we derived the Euler equations for the interpolation 
problem, a set of differential equations which must be satisfied by the minimal surface. The Euler 
equations for the interior of a region were identical for both the square Laplacian and the quadratic 
variation, namely the biharmonic equation. Along the edges of the region, however, the natural 
boundary conditions imposed different equations on the solution surface. This fact is reflected in 
Figure 4. The second reason for the different surfaces arises from the form of the stencils obtained 
in Section 6.2. The stencils to be applied at the edges of a region in the case of quadratic variation 
arc numerically more stable than those to be applied in the case of the square Laplacian. (This may, 
in fact, simply be a reflection of the difference in Euler equations.) In cither case, it can be seen 
from Figure 4 that while minimizing the quadratic variation results in a reasonable approximation to a 
cylinder, minimizing the square Laplacian yields less acceptable results. 

In Figure 5, we illustrate a second synthetic example. In this case, the boundary conditions are 
points lying on a hyperbolic paraboloid, chooscn at random so that the known points do not form 
closed contours. As in the previous case, it can be seen that while the surfaces obtained by minimizing 
the two functionals are very similar in the interior of the region, the surfaces differ drastically along 
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Figure 4. Synthetic Example. The top figure shows a synthetic set of boundary conditions, 
consistent with a cylinder aligned with the axes of* the grid. The middle figure shows ''the surface 
obtained by applying the gradient projection algorithm to the square Laplacian functional. The 
bottom figure shows the surface obtained by applying the algorithm to' The quadratic variation. 
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Figure 5. Synthetic Example. The top figure shows a synthetic set of boundary conditions, consistent 
with a hvperbolic paraboloid. The points are chosen at random with a density of 10 percent. 
The middle figure shows the surface obtained by applying die gradient projection algorithm to 
the square Laplacian functional. The bottom figure shows the surface obtained by applying the 
algorithm to the quadratic variation. 6 
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Figure 6. Synthetic Example. The top figure shows n synthetic set of boundary conditions, 
consistent, with a cylinder not aligned with the a>,es of the grid. The middle figure shows the 
surface obtained by applying the gradient projection algorithm to (he : .qua re I aplacian functional, 
The hodoni figure shows the surface obtained by applying the algorithm to the quadratic variation. 
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the edges of the region. Again, minimizing the quadratic variation yields a reasonable approximation 
to a hyperbolic paraboloid. 

In Figure 6, we illustrate a third synthetic example. In this case, the boundary conditions are 
again taken from a cylinder, here oriented at 45 degrees to the s-axis. As in the previous cylinder 
example, the major difference between the two surfaces occurs along the borders of the region and the 
minimization of quadratic variation yields a good approximation to the cylinder. 

The interpolation algorithm was developed to account for die creation of complete surface 
descriptions from the sparse surface information provided by visual modules such as stereo. We can 
demonstrate the effectiveness of the interpolation theory by applying the algorithm to different stereo 
examples. In Figures 7-10, we illustrate the results of applying the surface interpolation algorithm 
to the output of the Grimson implementation [Grimson, 1981a, 1981b] of the Marr-Poggio stereo 
theory [Marr and Poggio, 1979] applied to a pair of stereo images. It should be noted that in these ex- 
amples the interpolation algorithm was applied directly to the disparity values obtained by the stereo 
algorithm, without converting them to depth information. As a consequence, the displayed surfaces 
./^ in the figures will not exactly reflect the shape of the surface, since an additional nonlinear transfor- 

mation from disparity to depth is still required. For the purposes of illustrating the interpolation 
algorithm, however, the use of interpolated disparity values suffices, since the interpolation algorithm 
will preserve the general shape of the surfaces (that is, the sign of the surface curvature) as well as the 
relative differences in depth between different surfaces. 

Figure 7 shows four stereo pairs of images, on which the algorithm was tested. Figure 8 shows 
the surface obtained for a wedding cake random dot stereogram. The four planar surfaces arc clearly 
visible, although the effect of a small number of incorrect disparity values at the junctions of adjacent 
planes can be seen. Figure 9 shows the surface obtained for a spiral staircase random dot stereogram. 
Again, while the general shape of the spiral staircase is clearly apparent, the effect of a small number 
of incorrect disparity values can be seen. Figure 10 shows the surface obtained for the natural image 
of a coffee jar. As in the previous cases, the general shape of the surfaces are clearly evident. Not 
only is the jar sharply separated in disparity from the background plane (which is slightly slanted), but 
the overall shape of the jar can be distinguished. Figure 11 shows the surface obtained for the natural 
image of the Moore sculpture. As in the case of Figure 10, the general shape of the surface can be 



EXAMPLES OF INTERPOLATION 



50 




liriiiv ." , 4 Examples of Stereo hrnges. T he figures, from lop to bottom, show she random dot 
sleM" v-iiii oi ;i wielding cake, [lie random dot storco^M -un of a r.piral m ah case, a neutral linage 
of ;i coifee buttle, and a natural image of a sculpture by Henry Moore. 
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Figure 8. The Wedding Cake. The figure shows (he surface obtained by processing the stereo 
paii^ with the Crimson .implementation of the Marr-Poggio stereo algorithm, and interpolating the 
result using the quadratic variation. 
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Figure 9. The Spiral Staircase. The figure shows the sui face obtained by processing the stereo 
pair with the ''Crimson implementation of the Marr-Poggio stereo algorithm, and interpolating the 
result using the quadratic variation. 
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f-uiirc 10. '[lie CoiUv Lu\ The Ftfnrrcs show two views of die suifk'-e chained in processing the 
•'mvu i' M ir o!' ris'iirc 7 v.iih the (Mim^n iiiij *!**mviw:i; U mi oi" l«ie M;iiT-IW;'iu siereo J>*,oii(hm, 
■iin! inli'ipoi.-iniii; ihe K:.uh using the qtuHuiie variation. 
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Figure 11. The Moore Sculpture. The figure shows a view of the surface obtained by processing 
the stereo pair of Figure 7 with the Crimson implementation of the fvlarr-Poggio stereo algorithm, 
and interpolating the result using the quadratic variation. 
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distinguished. Note that because no disparity values can be obtained for the hole in the center of the 
sculpture, the interpolation algorithm interpolates across the hole as if it were a uniform surface. 

6.4 The Conjugate Gradient Approximation Algorithm 

Previous sections have addressed the case of interpolating a surface through the known stereo 
depth values. In this section, the case of approximating a surface relative to the known stereo depth 
values is considered. There are several reasons for considering an approximation of the known depth 
values, rather than an exact fit through them. The first reason is that the accuracy of the stereo data 
may not be sufficient for the purpose of surface approximation. In particular, the algorithm outlined 
for performing the stereo computation yields disparity matches with an accuracy of one picture ele- 
ment. One must consider if such accuracy is sufficient. As well, one must consider the accuracy 
with which the zero-crossing positions reflect the location of a point of interest on an object in the 
scene. Since die operators which extract the zero-crossings have a non-infinitesimal spatial extent, it 
is possible that the zero-crossing positions undergo slight fluctuations in position, such fluctuations 
causing a small error in the dispaiity matches assigned by die algorithm. The second reason is that 
the stereo algorithm does occasionally make an incorrect match If an exact surface interpolation is 
required, such points will incorrectly cause a change in the shape of the surface, and die effect of 
such points can spread over a noticeable region of the surface reconstruction. By requiring a surface 
approximation, the effect of such "bad" disparity points can be minimized. 

The basic notion is to combine a measure of "nearness of fit to the known points" with 
a measure of the consistency of die surface with the zero-crossing information. This can be ac- 
complished by considering a penalty method unconstrained optimization problem. Here, the objec- 
tive function to minimize is 

( 5 ) = J j^sl x + s\ y + 2sl^ dxdy + P J2 (s[x, y) - c(s, y)) 2 

where the summation takes place over the set J of all points in the representation for which there is a 
known stereo depth value c[x f y). The effect of this objective function is to minimize a least-squares 
fit through the known points, scaled relative to the original minimization problem. The constant /? 
is a scale parameter to be determined by the degree of desired fit. Note that the constraints have, in 
this case, been incorporated directly into the objective function. Hence, the objective function may be 
optimized as if it were an unconstrained function. 
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The translation of this problem into the image domain yields the following discrete version of 
the objective function: 
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minimize 
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It is now possible to consider applying the conjugate gradient method to this problem, Recall 
that this method, when applied to the quadratic case, is considered the minimization of 

^s r Qs — b r s. 

In this case, the vector b is given by 

b = — 2/?c 

where c is a vector whose components are the known depth values c^j) if the corresponding grid 
point has such a known value, and otherwise. The matrix Q is given by the discrete stencils outlined 
in the previous section, with an added diagonal factor of 6 S . One can then straightforwardly apply the 
conjugate gradient algorithm, with these forms for b and Q. 

6.5 Examples of Approximation 

The conjugate gradient algorithm, applied to the surface approximation problem, is 
demonstrated by considering a series of examples, illustrated in Figures 12-16, These should be corn- 
pared to Figures 8-11. As in the case of surface interpolation, the surface approximation algorithm 
has been applied to disparity values rather than depth information. Again, the general shape of the 
surface and the relative difference in positions of the surfaces have been preserved by the algorithm., 
although the exact surface shape has not been reconstructed. 
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The objective function of the conjugate gradient algorithm in this case contains two factors, 
the quadratic variation of the surface and a least-squares term embodying a type of "smoothness" 
requirement. The scalar constant (3 determines the relative strengths of these two factors. If we let 
(3 be very small, then the smoothness requirement essentially vanishes and we return to the case of 
surface interpolation, discussed previously. The conjugate gradient algorithm then becomes identical 
to the gradient projection algorithm. If we let (5 become very large, then die quadratic variation factor 
essentially vanishes and the algorithm reduces to a least-squares fitting of a plane to the known points. 
Clearly, we require a value of/? intermediate to these extreme cases. The figures illustrate this tradeoff 
between the two factors, as (3 varies. To determine the optimal value for /?, we require an estimate for 
the density of incorrect disparity values obtained by the stereo algorithm, so that a value for /? may 
be chosen which smooths out the effect of diese incorrect values, while not affecting the shape of the 
surface determined by minimizing the quadratic variation. 
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7. Analysis and Refinements 



7.1 Discontinuities 



One of the implicit assumptions of the interpolation algorithm is that the pieces of surface are 
in fact pieces of a single surface. Of course, this will frequently not be the case. In this section, we 
consider what modifications are necessary in order to account for the existence of several surfaces 
within a scene. In particular, we address the issue of explicitly computing discontinuities in the surface 
representation, and die effects of explicit discontinuities on the form of the reconstructed surface. 

One of the problems associated with the failure to make surface discontinuities explicit is that 
information about the shape of one surface affects the shape of an adjacent surface. This is illustrated 
in Figure 17. A set of known depth points is given in Figure 17(a). Intuitively, the most likely 
surface to fit through these points would be a pair of planes with a discontinuity in depth between 
them, shown in Figure 17(6). However, the requirement that a smooth surface fit through these points 
results in a warping and rippling of the surface that is undesirable, as shown in Figure 17(c). Thus, 
the lack of explicit discontinuities can affect the shapes of the interpolated surfaces in an unacceptable 
/"■"N manner. 
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Figure 12. The Wedding Cake. The figures show surfaces obtained by approximating the surface 
using quadratic variation. The scalar constant relating the least squares term to the quadratic term 
is p = ! in the top figure and p = O.i in the bottom figure. 
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liiUire IV lite Spiral Staircase. The figures show surfaces obtained by approximating the surface 
tisifiL' qnadraoe variation. Hie bralar constant relating (he least squares term lo the quadiaiic term 
is p -~ i in the top figuie and j3 ~- 0.1 in the bottom figure. 
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! igiire 14. The Codec Jar. The figures show" { wo views of a surfaa; o'aameu by approximating 
;!iv a rhse : using quadfatic variation, The scalar constant relating (lie least square:, lemi lo the 
qihulraiic lerni is ft ■■■-■■ 1. 
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bigine 15. The Codec Jar. The figures show two views of a surface obtained by approximating 
the surface using quadratic \atiaUon. "I ho scalar constant relating the lear.t squares tenn to the 
quadratic term is [3 ~~ 0.01. 
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liiiiirc 16. The Moore Seiilpiurc. Tlic hemes show two approximations of the si 
1»\ ir.inr, t|ii:Kir;i(i(.- vnrialion. The- sealar consnml relating Lite len-.i s< ) u : i i ;.-:■: term t< 
term is tl -■■■- O.h in I lie ioj> hgnic anii p ■■•■- 0.0;. in she holloni figure. 
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Retire 17. Discontinuities in the Surfaces. Figure (a) shows a set of known data points. Intuitively, 
iheVoneel reconstructed surface would he a pair of planes, with a discontinuity between them, as 
shown in figure (b), U the interpolation uh'urithm attempts to reconstruct a surface through the 
boiindaiy points, without a discontinuity, the result is as shown in figuic <d The sharp change 
in depth results in a rippling of the surface. 
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In order to make discontinuities explicit, there are several questions to ask about the process. 
How are the discontinuities detected? Where are they placed in the representation? When does the 
detection of discontinuities take place in the overall interpolation process? In the next few sections, 
we will discuss two possible methods for detecting the discontinuities, and their role in the overall 
interpolation. 

7.1.1 Occlusions in the Stereo Algorithm 

Consider the geometry indicated in Figure 18. There are regions of the left image which will not 
have a corresponding region in the right image, and vice versa. Consequently, any zero-crossings in 
this portion of one image will have no counterpart in the other image, and the stereo algorithm should 
not assign any match to such zero-crossings. Hence, one possible mechanism for detecting occlusions 
would be to search for portions of the image which contain unmatched zero-crossings. Then, the 
interpolation can be restricted to take place only over those sections of the image which are bounded 
by zero-crossings with known disparity values. 

This method would detect the discontinuities before the interpolation, since it uses stereo in- 
formation directly to locate the occlusions. A problem with the method is that it will not detect 
all discontinuities, only those in the horizontal direction. Discontinuities that occur in the vertical 
direction do not cause occlusions. Hence, any method for detecting discontinuities which relies only 
on the unmatched zero-crossings will be incomplete. 

7.1.2 The Primal Sketch Revisited 

An integral part of most computational theories, proposed as models of aspects of the human 
visual system, is the use of computational constraints based on assumptions about the physical world 
[Marr, 1976, 1980; Marr and Poggio, 1979; Marr and Hildreth, 1980; Ullman, 1979], [n some of the 
computational theories, the constraints are explicitly checked for validity within tlie algorithm (e.g. 
Ullman's rigidity constraint in recovering staicture from motion). In others, the constraints are simply 
assumed to be true, and arc not explicitly checked (e.g. Marr and Poggio's uniqueness constraint in 
stereopsis). Can any aspect of the surface consistency constraint be explicitly checked and used by the 
algorithm? 

The basic notion of the surface consistency constraint is that the surface cannot undergo a radi- 
cal change in shape without having an accompanying /em-crossing in the convolved image. Implicit 
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Figure 18. Occlusions. The upper surface occludes portions of the lower surface in each eye. 
These portions are different for the two eyes. The cross-hatched area of the lower surface indicates 
the region of the surface visible to the left eye, but not to the right. 



in this constraint is the assumption that the portion of the image being examined in fact corresponds 
to a single object. Thus, one could propose that if the shape of the interpolated surface forces a zero- 
crossing in a location for which none exists in the Primal Sketch, then such a zero-crossing indicates 
a location at which the assumption of a single object is violated. Such zero-crossings could then be 
taken as indicative of a surface discontinuity. 

Perhaps the simplest method of detecting such discontintuities is again to use ideas inherent in 
the Primal Sketch. Recall that the Primal Sketch created descriptions of points in the image associated 
with inflections in intensity, for a range of resolutions. Since the image intensities may be considered 
as a type of three-dimensional surface, the Primal Sketch operators essentially detect discontinuities 
in the image intensities for a range of resolutions. Thus, one could apply the same type of analysis 
to the detection of surface discontinuities, where now the surface on which the operators apply is the 
reconstructed depth surface, rather than the intensity surface. 
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It is worth noting that not only should the operators be of the form used in the extraction 
of the Primal Sketch/but that it may also be useful to use a range of operators, as in the Primal 
Sketch. One reason for using multiple zero-crossing detectors was that surface changes, and hence 
intensity changes, could take place over a wide range of scales. This is still true in the case of 
surface descriptions, such as have been constructed for the coffee jar or the wedding cake. Thus, 
surface discontinuities corresponding to occluding edges will frequently tend to correspond to large 
surface changes, while internal surface discontinuities, due to a warping of the surface, will tend 
to correspond to small surface changes. By using a range of V 2 G operators, one can extract both 
occluding contour discontinuities, as well as ripples or warpings of the surface itself. 

Note that this method requires that the surface interpolation already take place, before it can be 
applied. Since one of the general requirements on an algorithm is that it be rapid, we must consider 
the consequence of detecting discontinuities after the interpolation of the surfaces. There are two 
main reasons for the explicit detection of discontinuities. One is that such an explicit representation 
of this information will allow higher level processes, such as recognition, or extraction of axes for 
three-dimensional shape analysis, to operate more easily, since the process serves to make implicit 
information explicit. However, a second reason is to create more accurate surface representations, by 
removing the type of effect illustrated in Figure 17(c). If the process used to isolate discontinuities 
takes place after interpolation, and if the interpolation process requires the discontinuities to improve 
the interpolated surface approximation, one must propose an interpolater which passes over the sur- 
face information twice; first to produce an initial description, and second to refine the description 
after the detection of discontinuities. One must then question whether such a two pass process will 
affect our constraint of rapid algorithms. Fortunately, the answer is no, since the surface approxima- 
tion obtained without explicitly accounting for the discontinuities is very close to the limiting surface 
except in the areas of the discontinuities (that is, any effects of flic discontinuities are quickly damped 
out as one moves across the surface). Thus, the initial starting position for the second pass of the 
interpolation algorithm is very close to the limiting surface, and only a few iterations will be needed to 
refine the surface approximation. 
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7.1.3 Interpolation Over Occluded Regions 

Even though occluded regions of the image can only be viewed from one eye, the human system 
still associates a depth value with these regions. This has an interesting implication for the interpola- 
tion algorithm. For most occluded regions, the only depth information available is at the edges of the 
occluded region. Psychophysical experiments have shown that the occluded region is always perceived 
at the depth of the lower surface. Thus, in Figure 18, the occluded region would be perceived at the 
level of the lower surface. Note that this is consistent with the physics of the situation, since if the 
occluded region were perceived at the level of the upper surface, then it should in fact be visible to the 
right eye, and this is not the case. 

This observation suggests that when an occlusion is detected, it is explicitly located along the 
occluding boundary corresponding to the edge of the nearer object. This allows the occluded region 
itself to be associated with the lower surface, and the interpolation algorithm will fill in surface values 
for the occluded region from this lower surface. 

This raises an interesting psychophysical prediction. The psychophysical literature has examined 
the case of planar surfaces and their occlusions, as in Figure 18. If the interpolation method developed 
here is given an explicit discontinuity along one edge of the occluded region, it will correctly fill in 
the region as an extension of the lower plane. Of interest is the case in which the occluded region is 
not planar. For example, consider a cylindrical object. If the interpolation algorithm is given this type 
of input, it will fill in the occluded portions of the surfaces as a smooth continuation of the curved 
cylinder. If the interpolation algorithm correctly models interpolation by the human visual system, 
then this predicts that die surface perception for human observers in this situation should also be that 
of a smooth cylinder. While informal experiments indicate that this is true, the prediction has not yet 
been rigourously tested psychophysical^. 

7.2 Noise Removal 

Although in general the Marr-Poggio stereo algorithm is very good at matching zero-crossings 
correctly (especially for random dot patterns), incorrect disparity values may sometimes be assigned to 
regions of the image. These incorrect values can be considered as noise superimposed on the correct 
sui face. Since the surface interpolator explicitly attempts to fit a surface through all the disparity 
points, such noise points can affect the shape of the surface approximation. Indeed, the effect of these 



^~v 



ACUITY 68 

noise points can spread over a noticeable portion of the surface, before the nearby disparity values can 
damp out its effect. Thus, it would be preferable to remove these noise points, or at least neutralize 
their effect on the approximated surface shape. One possibility is that if a two pass interpolator is 
used, as suggested in the previous section, the detection of surface discontinuities will isolate such 
noise points from the rest of the surface, and 'the second pass of the interpolator will adjust the surface 
approximation to remove the influence of the noise points on the first pass approximation. Certainly 
this will be true for noise points with disparity values far removed from the correct values. For 
noise points whose disparity values are only slightly different from the correct surface disparities, the 
difference does not really matter. However, the final result would be that the noise points, while being 
isolated from the rest of the correct surface, would still remain in the final surface description. It 
would be preferable to completely remove such points. 

Is it possible to identify and remove noise points from the disparity map? If the noise points are 
isolated spatially, then it is possible to identify them as undesirable, This follows from the form of the 
primal sketch operators. The case to consider is that in which one must distinguish between a set of 
noise points in a disparity map and a small object separated in depth from the rest of the scene. For 
the small object, the size of the zero-crossing contour is limited by the size of the available operator, 
and hence there is a minimum size of zero-crossing contour which the operator will yield about the 
object. If the number of zero-crossing points which differ significantly from their neighbors is less 
than this minimum, one may conclude that the points are noise, and thus remove them. This will 
result in an improved surface approximation. 

7.3 Acuity 

It can be seen from the example of the interpolated coffee jar in Figure 10, that the interpolated 
surface contains a bumpy quality which clearly is not consistent with die original object. How can 
this be explained? The effect occurs in part because the disparity values are specified only to within a 
pixel. This yields a fairly coarse disparity map which results in the bumps observed in the interpolated 
coffee jar of Figures 14 and 15, Hence, one method of removing the bumps would be to improve 
the accuracy of the disparities obtained by the algorithm. Note that some improvement in disparity 
accuracy is necessary if- the algorithm is to be consistent with the human system. If we roughly equate 
pixels with receptors, then a pixel corresponds to roughly 27 seconds of arc. The implementation of 
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the stereo algorithm computed disparity to within a pixel, while humans are capable of stereo acuity 
to a resolution of 2 — 10 seconds [Howard, 1919; Woodburne, 1934; Berry, 1948; Tyler, 1977]. 

In order to account for finer disparity values, it is necessary to localize the zero-crossing to a bet- 
ter accuracy than. has been done so far. Since the convolution values are only specified at each pixel, 
one method for more accurately specifying the zero-crossing positions is to interpolate between the 
known convolution values [Crick, Marr and Poggio 1980, Marr, Poggio and Hildreth 1979, Hildreth 
1980]. Perhaps the simplest method is to rely on the observation of Hildreth that for most cases, 
even a simple linear interpolation will give extremely accurate localization of the zero-crossings. The 
addition of finer resolution depth information may improve the performance of the algorithm. 

This example also raises a question of scale. Depending on the application of the surface 
specification, different amounts of resolution may be required. For example, if the ultimate goal of 
the surface specification is to obtain a rough idea of the position and shape of the surfaces in a scene, 
the spatial resolution at which surface information must be made explicit may not be critical. In this 
case, the known data from the stereo algorithm may be sampled at a coarser resolution, before the 
^ interpolation takes place. This should result in a smoother surface approximation. Further, although 

the reconstructed surface is less exact in terms of fine variation of the surface shape, the overall shape 
of the bottle is still preserved in this interpolation. 

7.4 Psychophysics 

We close by listing a series of psychophysical questions of relevance to the interpolation process. 

(1) What is the form of the surface perceived in occluded regions? In particular, the minimization of 
quadratic variation suggests that if a portion of a curved object is occluded, then the surface in 
the occluded region should also be curved, and should minimize the quadratic variation across 
that region. 

(2) Figure 17 suggests that if discontinuities are not explicitly demarked in the interpolation 
process, a warping of die reconstructed surface (similar to Gibb's phenomena) will result. 
While, in principle, such ripples in the surface arc* undesirable, it is worth asking whether the 
human system specifically accounts for discontinuities before interpolation occurs. This may be 
rephrased by asking whether in stereoscopic situations similar to Figure 17, we perceive a Mach 

/****\ hand-like warping of the surface in depth? 
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(3) We have suggested that there are several possible functional whigh could be used to deteniiine 
the most consistent surface. Based on algorithmic and mathematical arguments, we choose the 
quadratic variation. Can we test the shape of the reconstructed surface psychQphysicrally? In 
particular, can we distinguish psychophysically between the minimum surface under quadratic 
variation and the minimum surface under some other functional, such as the square Laplacian? 
Is the reconstructed surface psychophysically consistent with the surface computed by quadratic 
variation? 

(4) What is the spatial resolution of die reconstructed surface? That is, what is the spacing of the 
grid upon which the values of the reconstructed surface are computed? 

The answers to these questions will help verify or correct die theory of visual surface; interpola- 
tion developed in this paper. 

8. Summary 

Computational theories of motion perception [Ullman, 1979] and stereo vision [Marr and 
Poggio, 1979] can only specify the computation of three-dimensional surface information at special 
points in the image. In order to account for the visual perception of complete surfaces, we have 
developed a computational theory of the interpolation of surfaces from visual information. 

The problem is constrained by the fact that the surface must agree with the information from 
stereo or motion correspondence, and not vary radically between these points. In Grimson [1981c], an 
explicit form of this surface consistency constraint is derived from the image intensity equation [Horn, 
1975]. The main point of the surface consistency constraint is that it requires the interpolated surface 
to vary as little as possible. 

To determine which of .two possible surfaces is more consistent with the surface consistency 
constraint, one must be able to compare the two surfaces. To do this, a functional from the space 
of functions to the real numbers is required, where the functional should measure some function qf 
the variation in the surface. In .this way, the surface most consistent with the visual infbnnation will 
be that \vhich minimizes the functional. To ensure that the functional has a upiqiie minimal surface, 
conditions on the form of the functional arc. derived. In particular, if the junctional js a cqpiplete 
semi-norm which satisfies the parallelogram law, or the space of functions is a seiriiT filbert space and 
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the functional is a semi-inner product, then there is a unique (to within an element of the null space of 
the functional) surface which is most consistent with the visual information. 

It can be shown, based on the above conditions plus a condition of rotational symmetry, that 
there is a vector space of possible functional which measure surface consistency, this vector space 
being spanned by the functional of quadratic variation and the functional of square Laplacian (Brady 
and Horn, 1981). Arguments based on the null spaces of the respective functionals were used to justify 
the choice of the quadratic variation as the optimal functional. 

Algorithms for computing the surface which minimizes quadratic variation in the case of exact 
surface interpolation and in the case of surface approximation were outlined and illustrated on a series 
of synthetic and actual surface interpolation examples. 
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